atomicMax + AtomicCAS(atomicExch)

鱼传尺愫 2021-01-24 22:20

I would like to ask whether there is a better way to combine two atomic operations.

My goal is to find the highest results for a set of K equations (more than 32) under a lis

1 answer
  • 2021-01-24 23:04

    Is there any issue in the following code?

    Yes, you can't use two atomics like that and expect coherent results. You have set up a possible race condition.

    Suppose thread A does the atomicMax and replaces the old value with 100. Then thread B does the atomicMax and replaces the value 100 with 110. Next, thread B does the atomicCAS and writes its index. Finally, thread A does the atomicCAS and overwrites thread B's index with its own. You now have a max value of 110 (written by thread B) paired with an index belonging to thread A.

    Even within a single warp, there is no stated order of execution of atomic operations.

    Is there a better way?

    1. Since your values are both 32-bit quantities, you might be interested in using a custom 64-bit atomic operation like this to update a value and an index at the same time, atomically.

    2. For large-scale usage (lots of threads), you may want to explore a classical parallel reduction. There are questions here on the CUDA tag, such as this one and this one, that discuss how to do an index+value reduction.
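As a rough illustration of option 1, the sketch below packs a value and an index into one 64-bit word and updates both with a single `atomicCAS` loop. It assumes the value is a `float` and the index an `int` (the question's actual types are not shown), and every name here (`packValIdx`, `unpackVal`, `atomicMaxValIdx`, `argmaxKernel`, `argmaxOnDevice`) is hypothetical. Error checking on the CUDA runtime calls is omitted for brevity.

```cuda
#include <cassert>
#include <cfloat>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helpers: value in the high 32 bits, index in the low 32 bits.
__device__ unsigned long long packValIdx(float val, int idx) {
    return ((unsigned long long)__float_as_uint(val) << 32) | (unsigned int)idx;
}

__device__ float unpackVal(unsigned long long packed) {
    return __uint_as_float((unsigned int)(packed >> 32));
}

// Replace the stored (value, index) pair only if val is strictly larger.
// Both fields change in one atomicCAS, so they cannot be torn apart.
__device__ void atomicMaxValIdx(unsigned long long* addr, float val, int idx) {
    unsigned long long old = *addr;
    while (val > unpackVal(old)) {
        unsigned long long assumed = old;
        old = atomicCAS(addr, assumed, packValIdx(val, idx));
        if (old == assumed) break;  // our pair was stored
        // otherwise another thread won; 'old' holds the fresh pair, re-check
    }
}

__global__ void argmaxKernel(const float* data, int n, unsigned long long* result) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicMaxValIdx(result, data[i], i);
}

// Host wrapper (assumes n > 0): returns the maximum and one index holding it.
void argmaxOnDevice(const float* h_data, int n, float* maxVal, int* maxIdx) {
    assert(n > 0);
    float* d_data = nullptr;
    unsigned long long* d_result = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMalloc(&d_result, sizeof(unsigned long long));
    cudaMemcpy(d_data, h_data, n * sizeof(float), cudaMemcpyHostToDevice);

    // Initial pair: value = -FLT_MAX, index = -1 (all ones in the low word).
    float lowest = -FLT_MAX;
    unsigned int lowBits;
    std::memcpy(&lowBits, &lowest, sizeof(lowBits));
    unsigned long long init = ((unsigned long long)lowBits << 32) | 0xFFFFFFFFull;
    cudaMemcpy(d_result, &init, sizeof(init), cudaMemcpyHostToDevice);

    argmaxKernel<<<(n + 255) / 256, 256>>>(d_data, n, d_result);

    unsigned long long h_result = 0;
    cudaMemcpy(&h_result, d_result, sizeof(h_result), cudaMemcpyDeviceToHost);
    unsigned int valBits = (unsigned int)(h_result >> 32);
    std::memcpy(maxVal, &valBits, sizeof(valBits));
    *maxIdx = (int)(unsigned int)(h_result & 0xFFFFFFFFull);

    cudaFree(d_data);
    cudaFree(d_result);
}
```

Calling `argmaxOnDevice(h, n, &v, &i)` on a host array fills `v` and `i` with a consistent pair. When several elements tie for the maximum, which of their indices is reported depends on thread scheduling, and NaN inputs are never stored because the comparison fails for them.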

    Global atomics on Kepler are pretty fast, so depending on your exact code and reduction "density" a global atomic reduction might not be a big problem performance-wise.
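For the parallel-reduction route in option 2, a minimal sketch might look like the following. Each block reduces its share of the input in shared memory while carrying the index alongside the value, and the host then picks the best of the per-block results. The `float`/`int` types, the block size of 256, the cap of 64 blocks, and the names `blockArgmax` and `argmaxByReduction` are all assumptions made for illustration, and error checking is again omitted.

```cuda
#include <cassert>
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

constexpr int BLOCK = 256;  // threads per block; must be a power of two

// Each block writes its own (max value, index) pair to blockVal / blockIdxOut.
__global__ void blockArgmax(const float* data, int n,
                            float* blockVal, int* blockIdxOut) {
    __shared__ float sVal[BLOCK];
    __shared__ int sIdx[BLOCK];
    int tid = threadIdx.x;

    // Grid-stride loop: each thread keeps a private running maximum.
    float best = -FLT_MAX;
    int bestIdx = -1;
    for (int i = blockIdx.x * blockDim.x + tid; i < n; i += gridDim.x * blockDim.x) {
        if (data[i] > best) { best = data[i]; bestIdx = i; }
    }
    sVal[tid] = best;
    sIdx[tid] = bestIdx;
    __syncthreads();

    // Tree reduction in shared memory; value and index always move together.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s && sVal[tid + s] > sVal[tid]) {
            sVal[tid] = sVal[tid + s];
            sIdx[tid] = sIdx[tid + s];
        }
        __syncthreads();
    }
    if (tid == 0) {
        blockVal[blockIdx.x] = sVal[0];
        blockIdxOut[blockIdx.x] = sIdx[0];
    }
}

// Host wrapper (assumes n > 0): launches the kernel, then finishes on the CPU.
void argmaxByReduction(const float* h_data, int n, float* maxVal, int* maxIdx) {
    assert(n > 0);
    int blocks = (n + BLOCK - 1) / BLOCK;
    if (blocks > 64) blocks = 64;  // arbitrary cap; the grid-stride loop covers the rest

    float *d_data = nullptr, *d_val = nullptr;
    int* d_idx = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMalloc(&d_val, blocks * sizeof(float));
    cudaMalloc(&d_idx, blocks * sizeof(int));
    cudaMemcpy(d_data, h_data, n * sizeof(float), cudaMemcpyHostToDevice);

    blockArgmax<<<blocks, BLOCK>>>(d_data, n, d_val, d_idx);

    std::vector<float> hVal(blocks);
    std::vector<int> hIdx(blocks);
    cudaMemcpy(hVal.data(), d_val, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(hIdx.data(), d_idx, blocks * sizeof(int), cudaMemcpyDeviceToHost);

    // Final pass over the per-block winners.
    *maxVal = hVal[0];
    *maxIdx = hIdx[0];
    for (int b = 1; b < blocks; ++b) {
        if (hVal[b] > *maxVal) { *maxVal = hVal[b]; *maxIdx = hIdx[b]; }
    }

    cudaFree(d_data);
    cudaFree(d_val);
    cudaFree(d_idx);
}
```

This version uses no atomics at all, so there is nothing to race on. A common variation is to have thread 0 of each block issue one combined 64-bit atomic instead of writing to a per-block array, which avoids the host-side pass at the cost of one global atomic per block.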
