How to make a kernel function that is callable from both the host and the device?

前端未结


The following attempt shows what I am trying to do, but it fails to compile:

```
__host__ __device__ void f(){}

int main()
{
    f<<<1,1>>>();
}
```


2 Answers

清歌不尽

2021-01-13 16:09
The tutorial you are following looks very old (from 2008?), so it may not match the version of CUDA you are using.

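You can declare the function `__global__` instead. That makes it a kernel: it runs on the device and is launched from the host with `<<<...>>>`. Note that this is not the same as `__host__ __device__`, which marks an ordinary function compiled for both sides. This works: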
```
__global__ void f()
{
    const int tid = threadIdx.x + blockIdx.x * blockDim.x;
}

int main()
{
    f<<<1,1>>>();
}
```
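To confirm that a launch like this actually ran, the runtime's error state can be checked after the launch. A minimal sketch, assuming a hypothetical kernel `write_tid` and a hypothetical host helper `run_write_tid` that are not part of the original answer:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: each thread stores its global index.
__global__ void write_tid(int *out, int n)
{
    const int tid = threadIdx.x + blockIdx.x * blockDim.x;
    if (tid < n) {
        out[tid] = tid;
    }
}

// Hypothetical host helper: launches the kernel with one block of n threads
// (so n must not exceed the per-block thread limit) and returns the first error.
cudaError_t run_write_tid(int *h_out, int n)
{
    int *d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_out, n * sizeof(int));
    if (err != cudaSuccess) {
        return err;
    }

    write_tid<<<1, n>>>(d_out, n);

    // Launch errors (e.g. an invalid configuration) show up here ...
    err = cudaGetLastError();
    // ... while errors during execution surface at the next synchronizing call.
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err == cudaSuccess) {
        err = cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_out);
    return err;
}

int main()
{
    int h_out[4] = {0};
    const cudaError_t err = run_write_tid(h_out, 4);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < 4; ++i) {
        printf("out[%d] = %d\n", i, h_out[i]);
    }
    return 0;
}
```

Writing the index to memory also keeps the kernel from being an empty no-op, which makes it easier to see that the launch did something.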

情歌与酒

2021-01-13 16:17

You need a CUDA kernel entry point, i.e. a `__global__` function, which then calls your `__host__ __device__` function. Something like:

```
#include <stdio.h>

__host__ __device__ void f() {
#ifdef __CUDA_ARCH__
    printf ("Device Thread %d\n", threadIdx.x);
#else
    printf ("Host code!\n");
#endif
}

__global__ void kernel() {
   f();
}

int main() {
   kernel<<<1,1>>>();
   if (cudaDeviceSynchronize() != cudaSuccess) {
       fprintf (stderr, "Cuda call failed\n");
   }
   f();
   return 0;
}
```
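The same pattern extends to functions that return a value: the `__host__ __device__` helper is compiled for both sides, and the kernel is only a thin wrapper around it. A minimal sketch, assuming a hypothetical helper `square_plus_one` and a hypothetical kernel `apply_kernel` that are not part of the code above:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: compiled once for the host and once for the device.
__host__ __device__ int square_plus_one(int x)
{
    return x * x + 1;
}

// Kernel entry point: applies the shared helper to each input element.
__global__ void apply_kernel(const int *in, int *out, int n)
{
    const int tid = threadIdx.x + blockIdx.x * blockDim.x;
    if (tid < n) {
        out[tid] = square_plus_one(in[tid]);
    }
}

int main()
{
    const int n = 4;
    int h_in[n] = {0, 1, 2, 3};
    int h_out[n] = {0};
    int *d_in = nullptr;
    int *d_out = nullptr;

    if (cudaMalloc(&d_in, n * sizeof(int)) != cudaSuccess ||
        cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(d_in, h_in, n * sizeof(int), cudaMemcpyHostToDevice);

    apply_kernel<<<1, n>>>(d_in, d_out, n);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        fprintf(stderr, "Cuda call failed\n");
        return 1;
    }
    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);

    // Compare the device result with a direct host call to the same function.
    for (int i = 0; i < n; ++i) {
        printf("x=%d device=%d host=%d\n", h_in[i], h_out[i], square_plus_one(h_in[i]));
    }

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Keeping the logic in the shared helper means the host and device paths cannot drift apart, and the host version can be unit-tested without a GPU.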
