问题
I have a question about the use of lock free queues.
Suppose I have a single-producer single-consumer queue, where the producer and consumer are bound to separate cores. The queue elements are buffers of shared memory, which is mmapped by both producer and consumer at the start.
The producer gets a queue element, populates the buffer with data and enqueues it, and the consumer dequeues the element, reads it and processes it in some fashion.
Do I, as a user of the lock-free queue, have to explicitly ensure that the buffer written by the producer is visible to the user? Or does the CAS (or other similar) primitive at the heart of the algorithm automatically provide the barrier?
The couple of examples that I have seen use integers as the payload, so this question of memory synchronization does not arise.
Thanks,
回答1:
This by definition is architecture-specific. For GCC on Intel CPUs use the GCC Atomic Builins - most of them imply full memory barrier.
回答2:
A few archiectures bind memory barriers to CAS; x86/x64 is one.
Others (ARM, for example) do not. On ARM, you perform an LL/SC, with a manual data-only memory barrier before and after.
回答3:
Interlocked primitives such as compare-and-swap generally come in variants with different memory barrier semantics. They vary somewhat between architectures, but generally you'll want to have "release" semantics on (at the latest) the operation in the producer that makes the structure visible to the consumer, and something with "acquire" semantics in the consumer before you access the data structure.
On some architectures (notably vanilla x86) you don't actually get a choice because every interlocked operation implies a full barrier -- but if that gets you into the habit of not requesting any barrier it will, due to Murphy, come back and bite you on some other architecture.
(Conversely, and also due to Murphy, if you meticulously research the options and insert the right barriers everywhere, events will probably conspire to make it such that no code you ever write will need to run on anything but x86).
回答4:
Do I, as a user of the lock-free queue, have to explicitly ensure that the buffer written by the producer is visible to the user? Or does the CAS (or other similar) primitive at the heart of the algorithm automatically provide the barrier?
Semantically, pushing data to a concurrent queue should have at least a release fence, and popping from the queue should have at least an acquire fence. I believe a good lock-free implementation should not impose caring about memory fences etc. onto its users. Even if the lock-free algorithm does not automatically put fences into proper places (or not for every architecture), the implementation, not users, must ensure correct visibility and ordering. Users should only care about choosing an implementation that "just works" (which may/must include testing that it really works) :)
来源:https://stackoverflow.com/questions/6975263/question-about-lock-free-queues