atomicInc() is not working

谁都会走 提交于 2019-12-13 06:47:29

问题


I have tried below program using atomicInc().

__global__ void ker(int *count)
{
    int n=1;
    int x = atomicInc ((unsigned int *)&count[0],n);
    CUPRINTF("In kernel count is %d\n",count[0]);
}

int main()
{
    int hitCount[1];
    int *hitCount_d;

    hitCount[0]=1;
    cudaMalloc((void **)&hitCount_d,1*sizeof(int));

    cudaMemcpy(&hitCount_d[0],&hitCount[0],1*sizeof(int),cudaMemcpyHostToDevice);

    ker<<<1,4>>>(hitCount_d);

    cudaMemcpy(&hitCount[0],&hitCount_d[0],1*sizeof(int),cudaMemcpyDeviceToHost);

    printf("count is %d\n",hitCount[0]);
  return 0;
}

Output is:

In kernel count is 1
In kernel count is 1
In kernel count is 1
In kernel count is 1

count is 1

I'm not understanding why it is not incrementing. Can anyone help


回答1:


Referring to the documentation, atomicInc does this:

for the following:

atomicInc ((unsigned int *)&count[0],n);

compute:

((count[0] >= n) ? 0 : (count[0]+1))

and store the result back in count[0]

(If you're not sure what the ? operator does, look here)

Since you've passed n = 1, and count[0] starts out at 1, atomicInc never actually increments the variable count[0] beyond 1.

If you want to see it increment beyond 1, pass a larger value for n.

The variable n actually acts as a "rollover value" for the incrementing process. When the variable to be incremented actually reaches the value of n, the next atomicInc will reset it to zero.

Although you haven't asked the question, you might ask, "Why do I never see a value of zero, if I am hitting the rollover value?"

To answer this, you must remember that all 4 of your threads are executing in lockstep. All 4 of them execute the atomicInc instruction before any execute the subsequent print statement.

Therefore we have a variable of count[0] which starts out at 1.

  1. The first thread to execute the atomic resets it to zero.
  2. The next thread increments it to 1.
  3. The third thread resets it to zero.
  4. The fourth and final thread increments it to 1.

Then all 4 threads print out the value.

As another experiment, try launching 5 threads instead of 4, see if you can predict what the value printed out will be.

ker<<<1,5>>>(hitCount_d);

As @talonmies indicated in the comments, if you swap your atomicInc for an atomicAdd:

int x = atomicAdd ((unsigned int *)&count[0],n);

You'll get results that you were probably expecting.



来源:https://stackoverflow.com/questions/18008975/atomicinc-is-not-working

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!