I read somewhere (can\'t find the page anymore) that lock free data structures are more efficient \"for certain workloads\" which seems to imply that sometimes they\'re actually
A common approach to implementing a lock-free data structure is to have a mutable reference to an immutable object, and have anything that wants to change the structure grab the reference, produce a new version of the object with suitable changes applied, and then CompareExchange the reference to point to the new object. If the CompareExchange works, great. If not, ditch the new object, re-grab the reference, and start over.
This can work well if producing the new object is cheap and the level of contention is low enough that the CompareExchange will usually work. If there is considerable contention, and if producing the new object is slow, simultaneous attempted updates by N threads may take N^2 time to complete. As an extreme example, suppose 100 threads are running on a CPU, an update takes 100ms of CPU time (just over a time-slice), and all 100 threads attempt to update an object at once. During the first ten seconds, each thread will produce a new object based on the original one. One of the threads will successfully do the CompareExchange, while the others will all fail. Then during the next 9.9 seconds, 99 threads will generate new versions of the object, after which one will successfully post its update and 98 will fail. The net effect will be that the lock-free method will take 505 seconds' worth of CPU time to perform 100 updates, when a locking system could have done them in about 10 seconds.