Can anyone describe the differences between __global__ and __device__?
When should I use __device__, and when should I use __global__?
A __global__ function is the definition of a kernel. Whenever it is called from the CPU, that kernel is launched on the GPU.
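As a rough illustration of a kernel being launched from the CPU, here is a minimal sketch. The names addOne and addOneOnGpu are made up for this example, and it assumes n > 0 and a working CUDA device; the launch configuration of 256 threads per block is an arbitrary choice.

```cuda
#include <cuda_runtime.h>

// Kernel (hypothetical name): __global__ marks it as launchable from host
// code. A kernel must return void. Each thread handles one element.
__global__ void addOne(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] += 1;
    }
}

// Host-side wrapper (hypothetical name): copies the array to the GPU,
// launches the kernel, and copies the result back. Assumes n > 0.
cudaError_t addOneOnGpu(int *hostData, int n)
{
    int *devData = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(int);

    cudaError_t err = cudaMalloc(&devData, bytes);
    if (err != cudaSuccess) {
        return err;
    }

    err = cudaMemcpy(devData, hostData, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 256;                        // arbitrary block size
        int blocks = (n + threads - 1) / threads; // enough blocks to cover n
        addOne<<<blocks, threads>>>(devData, n);  // launch from the CPU
        err = cudaGetLastError();                 // catches launch errors
    }
    if (err == cudaSuccess) {
        // This copy waits for the kernel to finish before reading results.
        err = cudaMemcpy(hostData, devData, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(devData);
    return err;
}
```

Calling addOneOnGpu on a host array {1, 2, 3, 4} with n = 4 should leave it as {2, 3, 4, 5}. The triple-angle-bracket syntax is what distinguishes a kernel launch from an ordinary function call.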
However, each thread executing that kernel may need to run the same piece of code again and again, for example swapping two integers. For that we can write a helper function, just as we would in a C program. A helper function that is called by threads executing on the GPU should be declared as __device__.
Thus, a __device__ function is called from the threads of a kernel, with one instance per calling thread, while a __global__ function is called from a CPU thread.
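To make the swap example concrete, here is a small sketch of a __device__ helper used from a __global__ kernel. The names swapInts, swapPairs and swapPairsOnGpu are invented for illustration, and the choice of having each thread swap one adjacent pair is just one possible way to exercise the helper.

```cuda
#include <cuda_runtime.h>

// Device helper (hypothetical name): callable only from code running on
// the GPU, i.e. from kernels or from other __device__ functions.
__device__ void swapInts(int &a, int &b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

// Kernel (hypothetical name): each thread swaps one adjacent pair,
// data[2*p] and data[2*p + 1]. A trailing unpaired element is left alone.
__global__ void swapPairs(int *data, int n)
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    int i = 2 * p;
    if (i + 1 < n) {
        swapInts(data[i], data[i + 1]); // one call per thread
    }
}

// Host-side wrapper (hypothetical name) that launches the kernel.
cudaError_t swapPairsOnGpu(int *hostData, int n)
{
    if (n < 2) {
        return cudaSuccess; // nothing to swap
    }

    int *devData = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(int);

    cudaError_t err = cudaMalloc(&devData, bytes);
    if (err != cudaSuccess) {
        return err;
    }

    err = cudaMemcpy(devData, hostData, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int pairs = n / 2;
        int threads = 256; // arbitrary block size
        int blocks = (pairs + threads - 1) / threads;
        swapPairs<<<blocks, threads>>>(devData, n);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) {
        err = cudaMemcpy(hostData, devData, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(devData);
    return err;
}
```

With a host array {1, 2, 3, 4, 5} and n = 5, this should produce {2, 1, 4, 3, 5}. Note that swapInts is never launched with the triple-angle-bracket syntax; it is called like a normal function, but only from device code.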