Can anyone describe the differences between __global__ and __device__?
When should I use __device__, and when should I use __global__?
__global__ marks CUDA kernels: functions that run on the device and are launched directly from the host. __device__ functions also run on the device, but they can only be called from __global__ and __device__ functions, not from the host.
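To make the distinction concrete, here is a minimal sketch. The names square, square_all, and square_on_gpu are made up for illustration, and error checking on the CUDA runtime calls is left out to keep it short, so treat it as a starting point rather than production code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// __device__: runs on the GPU and can only be called from device code
// (that is, from __global__ or other __device__ functions).
__device__ float square(float x) {
    return x * x;
}

// __global__: a kernel. It runs on the GPU, is launched from the host
// with the <<<blocks, threads>>> syntax, and must return void.
__global__ void square_all(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = square(in[i]);  // device code calling a __device__ function
    }
}

// Hypothetical host-side helper: copies the input to the GPU, launches the
// kernel, and copies the result back. Return codes are ignored for brevity.
std::vector<float> square_on_gpu(const std::vector<float>& host_in) {
    int n = static_cast<int>(host_in.size());
    std::vector<float> host_out(n);
    if (n == 0) return host_out;

    size_t bytes = n * sizeof(float);
    float* dev_in = nullptr;
    float* dev_out = nullptr;
    cudaMalloc(&dev_in, bytes);
    cudaMalloc(&dev_out, bytes);
    cudaMemcpy(dev_in, host_in.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    square_all<<<blocks, threads>>>(dev_in, dev_out, n);  // host launches the kernel

    // This blocking copy on the default stream waits for the kernel to finish.
    cudaMemcpy(host_out.data(), dev_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev_in);
    cudaFree(dev_out);
    return host_out;
}

int main() {
    std::vector<float> result = square_on_gpu({1.0f, 2.0f, 3.0f});
    for (float v : result) {
        std::printf("%g\n", v);
    }
    // square(2.0f);  // rejected by nvcc: a __device__ function cannot be called from host code
    return 0;
}
```

As a rule of thumb, use __global__ for the entry point you launch from the CPU, and __device__ for helper functions that the kernel (or other helpers) call on the GPU.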