atomicMax + AtomicCAS(atomicExch)

I would like to ask whether there is a better way to combine two atomic operations.

My goal is to find the highest results for a set of K equations (more than 32) under a lis

1 Answer

执念已碎

2021-01-24 23:04
Is there any issue in the following code?

Yes, you can't use two atomics like that and expect coherent results. You have set up a possible race condition.

Suppose thread A does the atomicMax and replaces the old value with 100. Then thread B does the atomicMax and replaces the 100 with 110. Next, thread B does the atomicCAS and writes its index. Finally, thread A does the atomicCAS and overwrites thread B's index with its own. You now have a max value of 110 paired with the index of thread A, whose value was only 100.
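The interleaving above can be replayed step by step. The following is only a host-side C++ sketch, not device code: `SharedResult`, `step_max`, `step_index` and `replay_race` are hypothetical names, and since the question's kernel is not shown here, it assumes each thread writes its index whenever its own atomicMax raised the stored value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical shared state: the running maximum and the index of its owner.
struct SharedResult {
    std::uint32_t max_value;
    int index;
};

// Stands in for atomicMax: stores max(old, v) and returns the old value.
inline std::uint32_t step_max(SharedResult& s, std::uint32_t v) {
    const std::uint32_t old = s.max_value;
    s.max_value = std::max(old, v);
    return old;
}

// Stands in for the second atomic (atomicCAS/atomicExch) that writes the index.
inline void step_index(SharedResult& s, int idx) { s.index = idx; }

// Replays the order described above: A max, B max, B index, A index.
inline SharedResult replay_race() {
    const int A = 0, B = 1;
    SharedResult s{0, -1};
    const std::uint32_t old_a = step_max(s, 100);  // A raises 0 to 100
    const std::uint32_t old_b = step_max(s, 110);  // B raises 100 to 110
    if (old_b < 110) step_index(s, B);             // B records its index
    if (old_a < 100) step_index(s, A);             // A overwrites it
    assert(s.max_value == 110);                    // the value is still correct
    return s;                                      // but s.index now names A
}
```

Each step on its own is atomic, yet the pair (value, index) is not updated as a unit, which is why the final state mixes thread B's value with thread A's index.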

Even within a single warp, there is no stated order of execution of atomic operations.

Is there a better way?
1. Since your values are both 32-bit quantities, you might be interested in using a custom 64-bit atomic operation to update a value and an index at the same time, atomically.
2. For large-scale usage (lots of threads) you may want to explore a classical parallel reduction. There are questions on the CUDA tag that discuss how to do an index+value reduction.
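The first suggestion can be sketched as follows. This is a host-side C++ analogue using `std::atomic`, not the linked implementation: `pack`, `unpack_value`, `unpack_index` and `update_max` are hypothetical names, and the values are assumed to be unsigned 32-bit integers. On the device, the same retry loop would presumably be built on the `unsigned long long` overload of atomicCAS, and other value types would need an encoding whose unsigned ordering matches their numeric ordering.

```cpp
#include <atomic>
#include <cstdint>

// Value goes in the high 32 bits and index in the low 32 bits,
// so a single 64-bit word carries both and is swapped as one unit.
inline std::uint64_t pack(std::uint32_t value, std::uint32_t index) {
    return (static_cast<std::uint64_t>(value) << 32) | index;
}

inline std::uint32_t unpack_value(std::uint64_t packed) {
    return static_cast<std::uint32_t>(packed >> 32);
}

inline std::uint32_t unpack_index(std::uint64_t packed) {
    return static_cast<std::uint32_t>(packed & 0xFFFFFFFFu);
}

// Replaces the stored pair only if `value` is strictly larger than the
// stored value. On a failed exchange, `seen` is refreshed with the current
// contents and the comparison is repeated.
inline void update_max(std::atomic<std::uint64_t>& best,
                       std::uint32_t value, std::uint32_t index) {
    const std::uint64_t candidate = pack(value, index);
    std::uint64_t seen = best.load();
    while (unpack_value(seen) < value &&
           !best.compare_exchange_weak(seen, candidate)) {
        // retry with the refreshed `seen`
    }
}
```

Because the value and the index travel in one word, no interleaving can leave a value paired with another thread's index. With the strict comparison used here, the first thread to store a given maximum keeps the slot on ties.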
Global atomics on Kepler are pretty fast, so depending on your exact code and reduction "density" a global atomic reduction might not be a big problem performance-wise.
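For the reduction route mentioned in the list above, the core idea is to reduce (value, index) pairs instead of bare values. The sketch below is a sequential host-side C++ illustration of the pairwise tree pattern, under the assumption of unsigned 32-bit values; `Candidate`, `better` and `argmax_reduce` are hypothetical names. In a kernel, each iteration of the inner loop would typically be one thread, with a block-level synchronization between passes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A value together with the position it came from.
struct Candidate {
    std::uint32_t value;
    std::uint32_t index;
};

// Keeps the larger value; on equal values keeps the lower index,
// so the result does not depend on the order of combination.
inline Candidate better(Candidate a, Candidate b) {
    if (b.value > a.value) return b;
    if (b.value == a.value && b.index < a.index) return b;
    return a;
}

// Pairwise tree reduction over (value, index) pairs.
// Precondition: `values` is not empty.
inline Candidate argmax_reduce(const std::vector<std::uint32_t>& values) {
    assert(!values.empty());
    std::vector<Candidate> work(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        work[i] = Candidate{values[i], static_cast<std::uint32_t>(i)};
    }
    // Each pass folds the upper half of the active range onto the lower half.
    for (std::size_t n = work.size(); n > 1;) {
        const std::size_t half = (n + 1) / 2;
        for (std::size_t i = 0; i + half < n; ++i) {
            work[i] = better(work[i], work[i + half]);
        }
        n = half;
    }
    return work[0];
}
```

Since every pass combines whole pairs, the index always stays attached to the value it belongs to, and no atomics are needed inside the reduction itself.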