Using atomic arithmetic operations in CUDA Unified Memory multi-GPU or multi-processor

心在旅途 2021-01-23 22:18

I am trying to implement a CUDA program that uses Unified Memory. I have two unified arrays and sometimes they need to be updated atomically.

The question below has an

1 answer
  • 2021-01-23 22:42

    To summarize comments into an answer:

    • You can perform this sort of address-space-wide atomic operation using atomicAdd_system.
    • However, you can only do this on devices of compute capability 6.x or newer (7.2 or newer if using Tegra).
    • Specifically, this means you have to compile for a suitable compute capability, such as -arch=sm_60 or similar.
    • You state in the question that you are using Tesla K20 cards. These are compute capability 3.5 and do not support any of the system-wide atomic functions.
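
    The points above can be sketched roughly as follows. This is a minimal illustration, not the asker's code: the kernel name add_on_gpu, the single managed counter, and the launch sizes are all made up for the example, and it assumes every visible GPU is compute capability 6.0 or newer with concurrent managed access (which typically means Linux). Build with something like nvcc -arch=sm_60 example.cu.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread adds 1 to a counter that lives in
// Unified Memory. atomicAdd_system makes the update atomic with respect
// to the CPU and other GPUs, not just this device.
__global__ void add_on_gpu(int *counter, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAdd_system(counter, 1);
    }
}

int main()
{
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess || device_count == 0) {
        std::printf("no CUDA device found\n");
        return 1;
    }

    // System-wide atomics need compute capability >= 6.0
    // (>= 7.2 on Tegra, which this simple check does not distinguish).
    for (int d = 0; d < device_count; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        if (prop.major < 6 || !prop.concurrentManagedAccess) {
            std::printf("device %d (sm_%d%d) cannot use system-wide atomics here\n",
                        d, prop.major, prop.minor);
            return 1;
        }
    }

    int *counter = nullptr;
    if (cudaMallocManaged(&counter, sizeof(int)) != cudaSuccess) {
        std::printf("cudaMallocManaged failed\n");
        return 1;
    }
    *counter = 0;

    const int n = 1 << 16;       // increments per GPU (arbitrary)
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Launch the same kernel on every GPU; all of them update one counter.
    for (int d = 0; d < device_count; ++d) {
        cudaSetDevice(d);
        add_on_gpu<<<blocks, threads>>>(counter, n);
    }

    // Wait for every GPU before reading the result on the host.
    for (int d = 0; d < device_count; ++d) {
        cudaSetDevice(d);
        cudaDeviceSynchronize();
    }

    std::printf("counter = %d, expected = %d\n", *counter, n * device_count);

    cudaFree(counter);
    return 0;
}
```

    On a compute capability 3.5 card such as the K20, the capability check fails, and compiling the kernel for sm_35 would not work either because atomicAdd_system is not available for that architecture.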

    As always, this information is summarized in the atomic functions section of the CUDA Programming Guide.
