Question
I have a simple kernel, in which I'm allocating some space using malloc, simply as:
    __global__ void chainKernel() {
        float* __restrict__ boo = (float*)malloc(sizeof(float));
        *boo = 0;
        *boo = *boo + 100;
        return;
    }
If I put a breakpoint on *boo = *boo + 100, I can't see the contents of *boo. Instead I get "Operation is not valid due to the current state of the object" next to the variable in the debugger window. If I remove the __restrict__, however, the value is shown correctly. Is this normal behavior?
My system: CUDA 5.5.20, Nsight 3.1.0.13141, Windows 7 x64, VS2010, GeForce GTX Titan.
Answer 1:
One of the benefits of __restrict__ is that it allows the compiler to be more aggressive with optimizations. When you have simple code like this that the compiler can completely optimize away, the __restrict__ keyword may help the compiler do just that.
One of the common reasons for not being able to inspect variables in the debugger is compiler optimization, either local (a variable going out of scope when you weren't expecting it) or global (a variable that has been completely optimized away).
Note that the definition of the kernel you've shown in this question does nothing useful. Therefore the compiler may be optimizing things away.
To work around this (for this case), put a printf("%f", *boo); statement immediately after the final assignment to *boo, so that the compiler cannot optimize the variable away. You should also compile with nvcc's -G switch when debugging; it generates device debug information and turns off device code optimizations.
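As a rough sketch of that workaround, the kernel from the question could look like the following. The NULL check, the free() call, and the host-side main() are additions for illustration and are not part of the original question; the file name in the build comment is hypothetical. Whether the debugger then shows *boo with __restrict__ still in place will depend on the toolchain version, so treat this as something to try rather than a guaranteed fix.

```cuda
// Build with device debug info and device optimizations disabled, e.g.:
//   nvcc -G -g -o chain chain.cu      (file name is hypothetical)
#include <cstdio>
#include <cuda_runtime.h>

__global__ void chainKernel() {
    float* __restrict__ boo = (float*)malloc(sizeof(float));
    if (boo == NULL) {
        return;                 // device-side malloc can fail; added for safety
    }
    *boo = 0;
    *boo = *boo + 100;          // breakpoint goes here
    printf("%f\n", *boo);       // observable use, so the store cannot be discarded
    free(boo);                  // release the device heap allocation
}

int main() {
    chainKernel<<<1, 1>>>();    // a single thread is enough to inspect the variable
    cudaError_t err = cudaDeviceSynchronize();  // wait for the kernel and flush device printf
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

The printf gives the value a visible side effect, which is what keeps the compiler from treating the whole computation as dead code. Launching with one block of one thread also keeps the debugger session simple, since only a single copy of boo exists.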
Source: https://stackoverflow.com/questions/18333124/cuda-debugging-with-vs-cant-examine-restrict-pointers-operation-is-not-v