cudaMemset fails on device variable

北荒 2021-01-25 04:51

I am having trouble using cudaMemset on a device variable. Is it possible to use the reference to the device variable for cudaMemset, or is it just a m

2 answers

旧时难觅i (original poster)

2021-01-25 05:22
Your problem is that d_test (as it appears in the host symbol table) isn't a valid device address and the runtime cannot access it directly. The solution is to use the cudaGetSymbolAddress API function to read the address of the device symbol from the context at runtime. Here is a slightly expanded version of your demonstration case which should work correctly:
```
#include <cstdio>          // fprintf
#include <cstdlib>         // exit
#include <cuda_runtime.h>  // CUDA runtime API

// device variable and kernel
__device__ float d_test;

// Print the CUDA error string with its source location, then optionally exit.
inline void gpuAssert(cudaError_t code, const char * file, int line, bool Abort=true)
{
    if (code != cudaSuccess) {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code),file,line);
        if (Abort) exit(code);
    }       
}

#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

int main()
{

    float * _d_test;

    gpuErrchk( cudaFree(0) );
    gpuErrchk( cudaGetSymbolAddress((void **)&_d_test, "d_test") );
    gpuErrchk( cudaMemset(_d_test,0,sizeof(float)) );

    gpuErrchk( cudaThreadExit() );

    return 0;
}
```
Here, we read the address of the device symbol d_test from the context into a host pointer _d_test. This can then be passed to host side API functions like cudaMemset, cudaMemcpy, etc.

Edit: the form of cudaGetSymbolAddress shown in this answer, which passes the symbol name as a string, has since been deprecated and removed from the CUDA runtime API. In modern CUDA the symbol itself is passed, so the call becomes:
```
gpuErrchk( cudaGetSymbolAddress((void **)&_d_test, d_test) );
```
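To check that the memset really reaches the device variable, the modern call can be combined with a read-back. The following is only a minimal sketch under a few assumptions: the helper name zeroAndReadBack and the initial value 1.0f are hypothetical additions for illustration, not part of the original answer, and cudaDeviceReset is used in place of the deprecated cudaThreadExit.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Device variable, given a non-zero initial value (an assumption made here)
// so that the effect of cudaMemset is observable.
__device__ float d_test = 1.0f;

#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

// Print the CUDA error string with its source location and exit on failure.
inline void gpuAssert(cudaError_t code, const char *file, int line)
{
    if (code != cudaSuccess) {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: zero d_test through its device address,
// then copy the value back to the host and return it.
float zeroAndReadBack()
{
    float *d_ptr = nullptr;
    float h_value = -1.0f;

    // Modern form: pass the symbol itself, not its name as a string.
    gpuErrchk( cudaGetSymbolAddress((void **)&d_ptr, d_test) );
    // All-zero bytes are the bit pattern of 0.0f.
    gpuErrchk( cudaMemset(d_ptr, 0, sizeof(float)) );
    gpuErrchk( cudaMemcpy(&h_value, d_ptr, sizeof(float), cudaMemcpyDeviceToHost) );
    return h_value;
}

int main()
{
    printf("d_test after cudaMemset: %f\n", zeroAndReadBack());
    gpuErrchk( cudaDeviceReset() );
    return 0;
}
```

Built with nvcc and run on a machine with a CUDA-capable device, this should report a value of zero for d_test. Note that cudaMemset writes a byte pattern, so it is only suitable for values such as zero whose representation is a repeated byte.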