CUDA char* variable assignment

Posted by 天涯浪子 on 2019-12-11 03:19:35

Question


This is a follow-up question to the selected answer in this post: Output of cuda program is not what was expected.

The kernel below works:

__global__ void setVal(char **word)
{

    char *myWord = word[(blockIdx.y * gridDim.x) + blockIdx.x];
    myWord[0] = 'H';
    myWord[1] = 'e';
    myWord[2] = 'l';
    myWord[3] = 'l';
    myWord[4] = 'o';
}

Why doesn't this one work?

__global__ void setVal(char **word)
{

    char *myWord = word[(blockIdx.y * gridDim.x) + blockIdx.x];
    myWord = "Hello\0";

}

Answer 1:


You should start paying much more attention to the output from the compiler. Your second kernel code:

__global__ void setVal(char **word)
{
    char *myWord = word[(blockIdx.y * gridDim.x) + blockIdx.x];
    myWord = "Hello\0";
}

compiles to a null kernel with nothing inside it:

$ nvcc -arch=sm_20 -c nullkernel.cu 
nullkernel.cu(3): warning: variable "myWord" was set but never used

nullkernel.cu(3): warning: variable "myWord" was set but never used

What you think is a string copy is really just a pointer assignment: it changes where the local variable myWord points and never writes to the buffer that the entry in word refers to. The compiler is smart enough to see that nothing is ever written to memory through myWord, so it eliminates all the code and warns you that myWord is set but never used.

If I were to ask a rhetorical question and re-write the code this way:

__global__ void setVal(char **word)
{

    char *myWord = word[(blockIdx.y * gridDim.x) + blockIdx.x];
    const char mymsg[] = "Hello\0";
    myWord = mymsg;
}

would it be more obvious both why the code doesn't compile and why it could never "implicitly" perform a string copy assignment even if it did compile?




Answer 2:


In your second version, myWord = "Hello\0";, the string "Hello\0" is not stored in the space that the word parameter points to. It is probably stored in a read-only data section of the executable (such as .rodata). The assignment simply updates the myWord pointer; it does NOT do any bulk copying of data. (As talonmies points out, the compiler can figure out that the pointer update isn't needed at all, and optimizes away the entire function. Neat.)

In general, C has no assignment syntax for copying arrays in bulk: the designers thought expensive things should look expensive. So, while PL/I makes assigning 0 to every element in a multidimensional array a very easy operation, A = 0;, C makes you write explicit for() loops or call memset() or memcpy(), which drives home the idea that it is expensive.

(Assigning structs, and passing or returning them by value, are the exceptions to the bulk-copy rule.)



Source: https://stackoverflow.com/questions/6566910/cuda-char-variable-assignment
