Usage of global vs. constant memory in CUDA

暖寄归人 2021-01-03 04:02

Hey there, I have the following piece of code:

#if USE_CONST == 1
    __constant__ double PNT[ SIZE ];    
#else
    __device__ double *PNT;
#endif


        
2 Answers
  • 2021-01-03 05:00

    Although this is an old question, I'm adding this for future googlers:

    The problem is here:

    cudaMalloc((void **)&PNT, sizeof(double)*SIZE);
    cudaMemcpy(PNT, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
    

    cudaMalloc writes to the host version of PNT, but PNT is a device variable that must not be accessed from the host. The correct approach is to allocate the memory, copy its address to the device symbol, and then copy the data into the memory that symbol points to:

    void* memPtr;
    cudaMalloc(&memPtr, sizeof(double)*SIZE);
    cudaMemcpyToSymbol(PNT, &memPtr, sizeof(memPtr));
    // In other places you'll need an additional:
    // cudaMemcpyFromSymbol(&memPtr, PNT, sizeof(memPtr));
    cudaMemcpy(memPtr, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
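
The three steps above can be put together in a minimal sketch, assuming a CUDA-capable device and compilation with nvcc. The value of SIZE, the kernel `scale`, and the helper `run_pointer_symbol_demo` are hypothetical names introduced only for illustration; they do not come from the question.

```cuda
#include <cuda_runtime.h>

#define SIZE 8  // hypothetical size, chosen only for this sketch

__device__ double *PNT;  // device-side pointer, as in the question's #else branch

// Hypothetical kernel: doubles every element reachable through PNT.
__global__ void scale(int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) PNT[i] *= 2.0;
}

// Returns 0 on success, 1 on any CUDA error or unexpected result.
// Error paths skip cleanup to keep the sketch short.
int run_pointer_symbol_demo()
{
    double point[SIZE], result[SIZE];
    for (int i = 0; i < SIZE; ++i) point[i] = i;

    // 1. Allocate the buffer through an ordinary host-side pointer.
    double *memPtr = nullptr;
    if (cudaMalloc(&memPtr, sizeof(double) * SIZE) != cudaSuccess) return 1;

    // 2. Store the buffer's device address in the __device__ symbol PNT.
    if (cudaMemcpyToSymbol(PNT, &memPtr, sizeof(memPtr)) != cudaSuccess) return 1;

    // 3. Copy the data into the buffer itself.
    if (cudaMemcpy(memPtr, point, sizeof(double) * SIZE,
                   cudaMemcpyHostToDevice) != cudaSuccess) return 1;

    scale<<<1, SIZE>>>(SIZE);
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;

    if (cudaMemcpy(result, memPtr, sizeof(double) * SIZE,
                   cudaMemcpyDeviceToHost) != cudaSuccess) return 1;
    cudaFree(memPtr);

    for (int i = 0; i < SIZE; ++i)
        if (result[i] != 2.0 * point[i]) return 1;
    return 0;
}
```

Keeping `memPtr` on the host side avoids the extra cudaMemcpyFromSymbol round trip whenever the buffer has to be read back or freed.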
    

    Easier would be:

    #if USE_CONST == 1
        __constant__ double PNT[ SIZE ];    
    #else
        __device__ double PNT[ SIZE ];
    #endif
    
    // No #if required anymore:
    cudaMemcpyToSymbol(PNT, point, sizeof(double)*SIZE);
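
As a rough illustration of why the array form is simpler, the sketch below uses the same host code for both memory spaces. It assumes a CUDA-capable device and nvcc; the value of SIZE, the default for USE_CONST, the kernel `sum_points`, and the helper `run_array_symbol_demo` are hypothetical and exist only for this example.

```cuda
#include <cuda_runtime.h>

#define SIZE 8          // hypothetical size for this sketch
#ifndef USE_CONST
#define USE_CONST 1     // assumed default; build with -DUSE_CONST=0 for global memory
#endif

#if USE_CONST == 1
__constant__ double PNT[SIZE];
#else
__device__ double PNT[SIZE];
#endif

// Hypothetical kernel: a single thread sums the array, whichever space it lives in.
__global__ void sum_points(double *out)
{
    double s = 0.0;
    for (int i = 0; i < SIZE; ++i) s += PNT[i];
    *out = s;
}

// Returns the sum computed on the device; h_out keeps its initial -1.0
// if the copy back to the host fails.
double run_array_symbol_demo()
{
    double point[SIZE];
    for (int i = 0; i < SIZE; ++i) point[i] = i + 1;  // 1, 2, ..., SIZE

    // Identical call for __constant__ and __device__ arrays.
    cudaMemcpyToSymbol(PNT, point, sizeof(double) * SIZE);

    double *d_out = nullptr;
    cudaMalloc(&d_out, sizeof(double));
    sum_points<<<1, 1>>>(d_out);

    double h_out = -1.0;
    cudaMemcpy(&h_out, d_out, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

With the values 1 through 8 the device-side sum should come back as 36 in either configuration. The trade-off is that SIZE must be a compile-time constant, which the `__constant__` branch required anyway.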
    
  • 2021-01-03 05:01

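    The correct usage of cudaMemcpyToSymbol prior to CUDA 4.0 is shown below. (Passing the symbol name as a string was removed in CUDA 5.0; newer toolkits take the symbol itself, i.e. PNT without quotes.)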

    cudaMemcpyToSymbol("PNT", point, sizeof(double)*SIZE);
    

    or alternatively:

    double *cpnt;
    cudaGetSymbolAddress((void **)&cpnt, "PNT");
    cudaMemcpy(cpnt, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
    

    which might be a bit faster if you are planning to access the symbol from the host API more than once.
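
A minimal sketch of that reuse pattern follows, written against the newer form of the API in which the symbol is passed directly rather than as a string. It assumes a CUDA-capable device and nvcc; SIZE, the kernel `read_point`, and the helper `run_cached_address_demo` are hypothetical names used only here.

```cuda
#include <cuda_runtime.h>

#define SIZE 8  // hypothetical size for this sketch

__constant__ double PNT[SIZE];

// Hypothetical kernel: copies one element of PNT out for inspection.
__global__ void read_point(double *out, int i)
{
    *out = PNT[i];
}

// Looks up the symbol address once, then reuses it for several uploads.
// Returns 0 on success, 1 on any CUDA error or mismatch.
// Error paths skip cleanup to keep the sketch short.
int run_cached_address_demo()
{
    double *cpnt = nullptr;
    if (cudaGetSymbolAddress((void **)&cpnt, PNT) != cudaSuccess) return 1;

    double *d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(double)) != cudaSuccess) return 1;

    for (int pass = 0; pass < 3; ++pass) {
        double point[SIZE];
        for (int i = 0; i < SIZE; ++i) point[i] = pass + i;

        // Plain cudaMemcpy through the cached address, no symbol lookup per pass.
        if (cudaMemcpy(cpnt, point, sizeof(double) * SIZE,
                       cudaMemcpyHostToDevice) != cudaSuccess) return 1;

        read_point<<<1, 1>>>(d_out, SIZE - 1);

        double h_out = -1.0;
        if (cudaMemcpy(&h_out, d_out, sizeof(double),
                       cudaMemcpyDeviceToHost) != cudaSuccess) return 1;
        if (h_out != point[SIZE - 1]) return 1;
    }

    cudaFree(d_out);
    return 0;
}
```

Whether this is measurably faster than repeated cudaMemcpyToSymbol calls depends on the toolkit and workload, so it is worth timing before relying on it.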

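    EDIT: I misunderstood the question. For the global memory version, do something similar to the second version for constant memory. Note that this only works if PNT is declared as a __device__ array of SIZE doubles; with the __device__ double *PNT declaration, the returned address refers to the pointer variable itself, not to a buffer: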

    double *gpnt;
    cudaGetSymbolAddress((void **)&gpnt, "PNT");
    cudaMemcpy(gpnt, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
    