How can I make a kernel function that is callable from both the host and the device?

忘掉有多难 2021-01-13 15:58

The following attempt shows what I am trying to do, but it fails to compile:

__host__ __device__ void f(){}

int main()
{
    f<<<1,1>>>();
}


        
2 Answers
  • 2021-01-13 16:09

    The tutorial you are following looks very old (2008?). It may not be compatible with the version of CUDA you are using.

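    Use __global__ for anything you launch with <<<...>>>. Note that __global__ is not the same as __host__ __device__: it declares a kernel that runs on the device and is launched from the host, whereas __host__ __device__ declares an ordinary function compiled for both sides. This works: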

    __global__ void f()
    {
        const int tid = threadIdx.x + blockIdx.x * blockDim.x;
    }
    
    int main()
    {
        f<<<1,1>>>();
    }
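
The kernel above computes tid but never stores it, so nothing observable happens. As a minimal sketch of how the same index calculation could be checked from the host, the example below writes each thread's index to device memory and copies it back. The names write_ids and run_write_ids and the choice of 4 threads per block are assumptions made for illustration, not part of the original answer.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread stores its global index.
__global__ void write_ids(int *out, int n)
{
    const int tid = threadIdx.x + blockIdx.x * blockDim.x;
    if (tid < n) {
        out[tid] = tid;
    }
}

// Hypothetical host wrapper: launches the kernel and copies the result back.
// Returns true only if every CUDA call succeeded.
bool run_write_ids(int *host, int n)
{
    int *dev = nullptr;
    if (cudaMalloc((void **)&dev, n * sizeof(int)) != cudaSuccess) {
        return false;
    }

    const int threads = 4;                          // arbitrary block size
    const int blocks = (n + threads - 1) / threads; // round up to cover n

    write_ids<<<blocks, threads>>>(dev, n);

    // cudaGetLastError reports launch failures; the blocking cudaMemcpy
    // then waits for the kernel on the default stream to finish.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(dev);
    return err == cudaSuccess;
}

int main()
{
    const int n = 8;
    int ids[n];
    if (!run_write_ids(ids, n)) {
        fprintf(stderr, "CUDA call failed\n");
        return 1;
    }
    for (int i = 0; i < n; ++i) {
        printf("ids[%d] = %d\n", i, ids[i]);
    }
    return 0;
}
```

Compiling this with nvcc requires the file to have a .cu extension (or the -x cu flag), since the <<<...>>> launch syntax is only understood in CUDA source files.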
    
  • 2021-01-13 16:17

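    You need to create a CUDA kernel entry point, i.e. a __global__ function, and call your __host__ __device__ function from it. The same function can still be called directly from host code. Something like: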

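    #include <stdio.h>
    
    // Compiled twice: once for the host and once for the device.
    __host__ __device__ void f() {
    #ifdef __CUDA_ARCH__
        // __CUDA_ARCH__ is defined only during the device compilation pass.
        printf("Device Thread %d\n", threadIdx.x);
    #else
        printf("Host code!\n");
    #endif
    }
    
    // Kernel entry point: launched from the host, runs on the device.
    __global__ void kernel() {
        f();
    }
    
    int main() {
        kernel<<<1,1>>>();
        // Wait for the kernel to finish so its printf output is flushed.
        if (cudaDeviceSynchronize() != cudaSuccess) {
            fprintf(stderr, "Cuda call failed\n");
        }
        f();  // plain host call of the same function
        return 0;
    }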
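
The same pattern extends to functions that return a value. The sketch below is a hedged illustration rather than part of the original answer: square and square_kernel are hypothetical names, and the input 3.0f is arbitrary. It evaluates one __host__ __device__ helper on the host and inside a kernel so the two results can be compared.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper, compiled for both the host and the device.
__host__ __device__ float square(float x)
{
    return x * x;
}

// Kernel entry point: runs on the device and calls the shared helper.
__global__ void square_kernel(float *out, float x)
{
    *out = square(x);
}

int main()
{
    const float x = 3.0f;
    float *dev_out = nullptr;
    float from_device = 0.0f;

    if (cudaMalloc((void **)&dev_out, sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    square_kernel<<<1, 1>>>(dev_out, x);

    // Check the launch itself, then copy the result back; the blocking
    // cudaMemcpy waits for the kernel on the default stream.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaMemcpy(&from_device, dev_out, sizeof(float),
                         cudaMemcpyDeviceToHost);
    }
    cudaFree(dev_out);

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Direct host call of the same function, next to the device result.
    printf("host: %f, device: %f\n", square(x), from_device);
    return 0;
}
```

The kernel exists only as a thin launch wrapper here; the logic lives in the __host__ __device__ function, which keeps the host and device paths from drifting apart.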
    