I am doing some tests and I realized that using the -G parameter when compiling gives me worse performance than compiling without it.
I have checked the documentation in Nvidi
Using the -G switch disables most compiler optimizations that nvcc might do in device code. For this reason, the resulting code will often run slower than code that is not compiled with -G.
This is pretty easy to see by running your executable in each case through cuobjdump -sass myexecutable and looking at the generated device code. You'll generally see less device code in the non -G case, and you can see the differences in specific optimizations as well.
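To try this yourself, here is a minimal sketch of a test program. The file name, kernel name, and problem sizes are all hypothetical and not taken from the question. It times one compute-heavy kernel with CUDA events, so you can build it with and without -G and compare both the reported time and the cuobjdump -sass output.

```cuda
// g_flag_test.cu (hypothetical file name, for illustration only)
//
// Build twice and compare:
//   nvcc    -o test_opt g_flag_test.cu
//   nvcc -G -o test_dbg g_flag_test.cu
// Then inspect the generated device code in each case:
//   cuobjdump -sass test_opt
//   cuobjdump -sass test_dbg
#include <cstdio>
#include <cuda_runtime.h>

// Each thread runs a dependent chain of multiply-adds, so the loop body
// cannot be removed entirely, but the optimizer is still free to unroll it
// and keep everything in registers.
__global__ void fma_chain(float *data, int n, int iters)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iters; ++k) {
            v = v * 1.0001f + 0.5f;
        }
        data[i] = v;
    }
}

int main()
{
    const int n = 1 << 20;     // number of elements (arbitrary choice)
    const int iters = 1000;    // loop trips per thread (arbitrary choice)

    float *data = nullptr;
    cudaMalloc(&data, n * sizeof(float));
    cudaMemset(data, 0, n * sizeof(float));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Warm-up launch so one-time startup cost is not included in the timing.
    fma_chain<<<blocks, threads>>>(data, n, iters);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    fma_chain<<<blocks, threads>>>(data, n, iters);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(data);
    return 0;
}
```

The actual timings depend on your GPU and CUDA version, so treat the numbers as a relative comparison between the two builds rather than as absolute figures.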
One of the reasons for this is that highly optimized device code may eliminate actual lines of source code and actual source code variables. This can make it very difficult to debug code. Therefore, to enable debugging, most optimizations are disabled with -G.
Also note that with Thrust, using the -G switch may result in unpredictable behavior. Newer versions of Thrust should behave better, but there may still be unexpected issues when compiling Thrust code with -G.