问题
This is a follow up question to an earlier question.
From the discussion, the mmc code (https://github.com/fangq/mmc) appears to be fine, and the memory was properly released when running on Intel CPU and AMD GPU. However, on NVIDIA GPU, valgrind
reported significant memory leak, so was the test. Every time after a cycle of creating and releasing a GPU context, the memory kept increasing.
You can see this result in the below memory (blue line) profiling report.
Here is the test and commands to reproduce the issue (need to run this on NVIDIA GPUs):
git clone https://github.com/fangq/mmc.git
cd mmc/src
sed -i -e 's/mmc_init_from_cmd/for(int i=0;i<5;i++){\nmmc_init_from_cmd/g' mmc.c
sed -i -e 's/return/getchar();}\nreturn/g' mmc.c
make clean
make all
cd ../examples/validation
../../src/bin/mmc -f cube2.inp -G 1 -s cube2 -n 1e4 -b 0 -D TP -M G -F bin
run ../../src/bin/mmc -L
to list GPUs, use -G #
to specify which GPU to use.
as you will see, the simulation will repeat 5 times, separated by enter keys. You can start a memory monitor, like top
command in Linux, and see the increasing memory allocation after each repetition.
I googled and found multiple previous reports on OpenCL memory leaks, but I did not find an solution. I would like to know if if there any trick to force NVIDIA OpenCL driver to clean up memory after each run. I am asking this because mmc has a MATLAB/Octave mex function which can be called multiple times, and this issue could lead to large memory usage after multiple calls.
来源:https://stackoverflow.com/questions/61163373/how-to-force-nvidia-opencl-to-release-gpu-context-to-avoid-memory-leak