Average latency of atomic cmpxchg instructions on Intel CPUs

独厮守ぢ 2020-12-31 16:09


I am looking for a reference on the average latency of the `lock cmpxchg` instruction on various Intel processors. I have not been able to locate a good reference on the topic.

5 answers
  •  迷失自我
    2020-12-31 16:26

    I've been looking into exponential backoff for a few months now.

    The latency of a CAS is utterly dominated by whether the instruction can operate from cache or has to operate from memory. Typically, a given memory address is being CAS'd by a number of threads (say, the entry pointer to a queue). If the most recent successful CAS was performed by a logical processor that shares a cache with the current executor (L1, L2 or L3, although the levels farther from the core are slower), then the instruction operates on cache and is fast: a few cycles. If the most recent successful CAS was performed by a logical processor that does not share a cache with the current executor, then that write will have invalidated the executor's copy of the cache line and a memory read is required, which takes hundreds of cycles.

    The CAS operation itself is very fast - a few cycles - the problem is memory.
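
    The difference between the uncontended and the contended case can be observed with a rough microbenchmark. The following is only a sketch under stated assumptions: `run_cas_benchmark` and `CasResult` are hypothetical names invented for this illustration, the figure it reports is wall-clock time divided by the number of increments (so it mixes CAS latency with loop overhead and retries), and the numbers depend heavily on the CPU, on the compiler, and on which cores the threads happen to be scheduled on.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical result type for this sketch.
struct CasResult {
    std::uint64_t final_value;  // should equal threads * iters
    double ns_per_increment;    // wall-clock time / total increments
};

// Hypothetical helper: `threads` threads each increment one shared counter
// `iters` times with a CAS retry loop, so they all fight over one cache line.
CasResult run_cas_benchmark(unsigned threads, std::uint64_t iters) {
    std::atomic<std::uint64_t> counter{0};
    std::atomic<bool> go{false};
    std::vector<std::thread> pool;

    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, &go, iters] {
            // Wait until every thread has been created, then start together.
            while (!go.load(std::memory_order_acquire)) {
                std::this_thread::yield();
            }
            for (std::uint64_t i = 0; i < iters; ++i) {
                std::uint64_t expected = counter.load(std::memory_order_relaxed);
                // On x86-64, compilers typically lower this to `lock cmpxchg`.
                while (!counter.compare_exchange_weak(
                           expected, expected + 1,
                           std::memory_order_acq_rel,
                           std::memory_order_relaxed)) {
                    // Failure refreshed `expected` with the current value; retry.
                }
            }
        });
    }

    auto start = std::chrono::steady_clock::now();
    go.store(true, std::memory_order_release);
    for (auto& th : pool) {
        th.join();
    }
    auto stop = std::chrono::steady_clock::now();

    double ns = std::chrono::duration<double, std::nano>(stop - start).count();
    double total = static_cast<double>(threads) * static_cast<double>(iters);
    return {counter.load(), ns / total};
}
```

    Comparing `run_cas_benchmark(1, n).ns_per_increment` with `run_cas_benchmark(4, n).ns_per_increment` for a large `n` gives a rough feel for the cost of the cache line moving between cores. The sketch does not pin threads to cores, so it cannot separate the shared-cache case from the no-shared-cache case described above; doing that would require OS-specific affinity calls.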
