Difference between global and device functions

渐次进展 2021-01-29 20:10

Can anyone describe the differences between __global__ and __device__?

When should I use __device__, and when to use __glob

9 answers
  • 2021-01-29 20:37

    I am recording some unfounded speculations here for the time being (I will substantiate these later when I come across some authoritative source)...

    1. __device__ functions can have a return type other than void but __global__ functions must always return void.

    2. __global__ functions can be launched from within other kernels running on the GPU to start additional GPU threads (as part of the CUDA dynamic parallelism model, aka CNP), while __device__ functions run on the same thread as the calling kernel.
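
    To make point 1 concrete, here is a minimal sketch rather than a definitive implementation. The names `square`, `square_kernel`, `square_on_gpu`, and `check` are made up for illustration. The __device__ function returns its result directly, while the kernel has to return void and therefore writes its result through a pointer.

    ```cuda
    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    // Hypothetical helper: abort on any CUDA runtime error.
    static void check(cudaError_t err) {
        if (err != cudaSuccess) {
            std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
            std::exit(EXIT_FAILURE);
        }
    }

    // A __device__ function may return a value like an ordinary C function.
    __device__ int square(int x) {
        return x * x;
    }

    // A __global__ function (kernel) must return void, so the result is
    // written to device memory through a pointer instead.
    __global__ void square_kernel(const int* in, int* out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            out[i] = square(in[i]);
        }
    }

    // Hypothetical host wrapper: squares one value on the GPU.
    int square_on_gpu(int value) {
        int* d_in = nullptr;
        int* d_out = nullptr;
        int result = 0;
        check(cudaMalloc(&d_in, sizeof(int)));
        check(cudaMalloc(&d_out, sizeof(int)));
        check(cudaMemcpy(d_in, &value, sizeof(int), cudaMemcpyHostToDevice));
        square_kernel<<<1, 1>>>(d_in, d_out, 1);   // one block, one thread
        check(cudaGetLastError());
        // This blocking copy also waits for the kernel to finish.
        check(cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost));
        check(cudaFree(d_in));
        check(cudaFree(d_out));
        return result;
    }
    ```

    Calling `square_on_gpu(7)` from `main` should give 49 on a machine with a CUDA-capable GPU.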

  • 2021-01-29 20:44

    A __global__ function is the definition of a kernel. Whenever it is called from the CPU, that kernel is launched on the GPU.

    However, each thread executing that kernel might need to run some piece of code again and again, for example swapping two integers. For this we can write a helper function, just as we would in a C program. For threads executing on the GPU, such a helper function should be declared as __device__.

    Thus, a device function is called from the threads of a kernel, one instance per thread, while a global function is called from a CPU thread.
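
    The swap example above could be sketched roughly as follows. This is only an illustration: `swap_ints`, `swap_pairs`, `swap_pairs_on_gpu`, and `check` are hypothetical names, and the block size of 128 is an arbitrary choice.

    ```cuda
    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    // Hypothetical helper: abort on any CUDA runtime error.
    static void check(cudaError_t err) {
        if (err != cudaSuccess) {
            std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
            std::exit(EXIT_FAILURE);
        }
    }

    // Helper that runs on the GPU. Every thread that calls it executes
    // its own instance on its own pair of values.
    __device__ void swap_ints(int& a, int& b) {
        int tmp = a;
        a = b;
        b = tmp;
    }

    // Kernel: thread i swaps elements 2*i and 2*i + 1.
    __global__ void swap_pairs(int* data, int n_pairs) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n_pairs) {
            swap_ints(data[2 * i], data[2 * i + 1]);
        }
    }

    // Host wrapper: swaps adjacent pairs of a host array holding
    // 2 * n_pairs ints, in place.
    void swap_pairs_on_gpu(int* host_data, int n_pairs) {
        if (n_pairs <= 0) return;  // a launch with zero blocks is invalid
        int* d_data = nullptr;
        std::size_t bytes = 2 * static_cast<std::size_t>(n_pairs) * sizeof(int);
        check(cudaMalloc(&d_data, bytes));
        check(cudaMemcpy(d_data, host_data, bytes, cudaMemcpyHostToDevice));
        int threads = 128;
        int blocks = (n_pairs + threads - 1) / threads;  // round up
        swap_pairs<<<blocks, threads>>>(d_data, n_pairs);
        check(cudaGetLastError());
        check(cudaMemcpy(host_data, d_data, bytes, cudaMemcpyDeviceToHost));
        check(cudaFree(d_data));
    }
    ```

    With `int a[4] = {1, 2, 3, 4};`, calling `swap_pairs_on_gpu(a, 2)` should leave `a` as `{2, 1, 4, 3}`.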

  • 2021-01-29 20:46
    1. __global__ - Runs on the GPU, called from the CPU or the GPU*. Launched with <<<...>>> execution configuration arguments (dim3 values).
    2. __device__ - Runs on the GPU, called from the GPU. Can be used with variables too.
    3. __host__ - Runs on the CPU, called from the CPU.

    *) __global__ functions can be called from other __global__ functions starting with
    compute capability 3.5.
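
    For the footnote, a rough sketch of a kernel launching another kernel (dynamic parallelism) might look like this. The names `child_kernel`, `parent_kernel`, `last_value_from_child`, and `check` are invented for the example, and I am assuming a build along the lines of `nvcc -rdc=true -lcudadevrt` targeting an architecture of compute capability 3.5 or higher.

    ```cuda
    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    // Hypothetical helper: abort on any CUDA runtime error.
    static void check(cudaError_t err) {
        if (err != cudaSuccess) {
            std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
            std::exit(EXIT_FAILURE);
        }
    }

    // Child kernel: each thread fills one element with twice its index.
    __global__ void child_kernel(int* data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            data[i] = 2 * i;
        }
    }

    // Parent kernel: a single GPU thread launches the child grid.
    __global__ void parent_kernel(int* data, int n) {
        if (blockIdx.x == 0 && threadIdx.x == 0) {
            int threads = 64;
            int blocks = (n + threads - 1) / threads;  // round up
            child_kernel<<<blocks, threads>>>(data, n);
        }
    }

    // Hypothetical host wrapper: runs the parent kernel and returns the
    // last element written by the child grid. Expects n >= 1.
    int last_value_from_child(int n) {
        int* d_data = nullptr;
        int last = -1;
        check(cudaMalloc(&d_data, n * sizeof(int)));
        parent_kernel<<<1, 1>>>(d_data, n);
        check(cudaGetLastError());
        // The parent grid is not complete until its child grid has finished.
        check(cudaDeviceSynchronize());
        check(cudaMemcpy(&last, d_data + (n - 1), sizeof(int),
                         cudaMemcpyDeviceToHost));
        check(cudaFree(d_data));
        return last;
    }
    ```

    Note that the child launch uses the same <<<...>>> syntax as a launch from the host; only the caller differs.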

  • 2021-01-29 20:50

    __global__ is for CUDA kernels, i.e. functions that are callable directly from the host. __device__ functions can be called from __global__ and __device__ functions, but not from the host.

  • 2021-01-29 20:50

    __global__ is a CUDA C keyword (declaration specifier) which says that the function:

    1. Executes on the device (GPU)
    2. Is called from host (CPU) code.

    Global functions (kernels) are launched by the host code using <<<no_of_blocks, no_of_threads_per_block>>>. Every thread executes the kernel and is identified by its unique thread ID.

    However, __device__ functions cannot be called from host code. If you need to do that, use both __host__ and __device__.
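
    As a small sketch of the combined qualifiers (the names `clamp01`, `clamp_kernel`, `clamp_on_gpu`, and `check` are invented for illustration), the same function below is compiled for both sides, so it can be called from ordinary host code and from a kernel:

    ```cuda
    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    // Hypothetical helper: abort on any CUDA runtime error.
    static void check(cudaError_t err) {
        if (err != cudaSuccess) {
            std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
            std::exit(EXIT_FAILURE);
        }
    }

    // Compiled for both CPU and GPU: clamps x to the range [0, 1].
    __host__ __device__ inline float clamp01(float x) {
        return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
    }

    // Kernel: every thread clamps one element by calling the shared helper.
    __global__ void clamp_kernel(float* data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            data[i] = clamp01(data[i]);
        }
    }

    // Hypothetical host wrapper: clamps a single value on the GPU.
    float clamp_on_gpu(float value) {
        float* d_value = nullptr;
        check(cudaMalloc(&d_value, sizeof(float)));
        check(cudaMemcpy(d_value, &value, sizeof(float), cudaMemcpyHostToDevice));
        clamp_kernel<<<1, 1>>>(d_value, 1);
        check(cudaGetLastError());
        check(cudaMemcpy(&value, d_value, sizeof(float), cudaMemcpyDeviceToHost));
        check(cudaFree(d_value));
        return value;
    }
    ```

    Here `clamp01(1.5f)` can be called directly on the CPU, while `clamp_on_gpu(1.5f)` goes through the kernel; both should return 1.0f.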

  • 2021-01-29 20:52

    Global functions are also called "kernels". They are the functions that you may call from the host side using CUDA kernel call semantics (<<<...>>>).

    Device functions can only be called from other device or global functions. __device__ functions cannot be called from host code.
