GPU Memory not freeing itself after CUDA script execution

Submitted by 流过昼夜 on 2019-12-11 03:58:37

Question


I am having an issue with my graphics card retaining memory after the execution of a CUDA script (even when I use cudaFree()).

On boot, the total used memory is about 128 MiB, but after the script has run, subsequent executions run out of memory mid-execution.

Output of nvidia-smi:

+------------------------------------------------------+                       
| NVIDIA-SMI 340.29     Driver Version: 340.29         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 660 Ti  Off  | 0000:01:00.0     N/A |                  N/A |
| 10%   43C    P0    N/A /  N/A |   2031MiB /  2047MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+

Is there any way to free this memory back up without rebooting, perhaps a terminal command?

Also, is this normal behaviour if I am not managing memory correctly in a CUDA script, or should this memory be freed automatically when the script stops or is quit?


Answer 1:


The CUDA runtime API automatically registers a teardown function which will destroy the CUDA context and release any GPU resources the application was using. As long as the application implicitly or explicitly calls exit(), no further user action is required to free resources like GPU memory.

If you do find that memory doesn't seem to be released when running CUDA code, then the usual suspect is a suspended or background instance of that or some other code which has never called exit() and so has never destroyed its context. That was the cause in this case.

NVIDIA does provide an API function, cudaDeviceReset, which initiates context destruction at the time of the call. It shouldn't usually be necessary to use this function in well-designed CUDA code; rather, you should try to ensure that there is a clean exit() or return path from main() in your program. This ensures that the context destruction handler registered by the runtime library is called and resources are freed.
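As a minimal sketch of that pattern (the buffer size is a hypothetical example, and error handling is trimmed for brevity), a well-behaved program frees its allocations and returns cleanly from main(); the cudaDeviceReset() call is shown only as the optional explicit teardown mentioned above, since a clean return would trigger the runtime's registered teardown anyway:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const size_t bytes = 64 * 1024 * 1024;  // hypothetical 64 MiB buffer
    float *d_buf = nullptr;

    if (cudaMalloc(&d_buf, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // ... launch kernels that use d_buf ...

    cudaFree(d_buf);       // release the allocation explicitly

    // Optional: destroy the context now rather than at process exit.
    // Not normally needed if main() returns cleanly.
    cudaDeviceReset();

    return 0;              // clean exit path -> context torn down, memory freed
}
```

If instead the process is suspended (e.g. stopped with Ctrl+Z) or killed in a way that skips this path, the context can linger and hold GPU memory until the process actually terminates.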



来源:https://stackoverflow.com/questions/29472093/gpu-memory-not-freeing-itself-after-cuda-script-execution
