I'm just reading the book C++ Concurrency in Action by Anthony Williams. There is this classic example with two threads, one produces the data, the other one consumes the data a
A "classic" bool, as you put it, would not work reliably (if at all). One reason for this is that the compiler could (and most likely does, at least with optimizations enabled) load data_ready only once from memory, because there is no indication that it ever changes in the context of reader_thread.
You could work around this problem by using volatile bool to force a load on every iteration (which would probably seem to work), but this would still be undefined behavior according to the C++ standard, because the access to the variable is neither synchronized nor atomic.
You could enforce synchronization using the locking facilities from the mutex header, but this would introduce (in your example) unnecessary overhead (hence std::atomic).
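As a minimal sketch of the std::atomic approach, the flag could look roughly like the following. The names data_ready, writer_thread and reader_thread are only modelled on the question, while payload and run_handoff are made up for illustration; none of this is copied from the book.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>

// Hypothetical names, modelled on the question's example.
std::atomic<bool> data_ready{false};
int payload = 0;  // plain (non-atomic) data, published via the atomic flag

void writer_thread() {
    payload = 42;            // prepare the data first
    data_ready.store(true);  // then publish it (sequentially consistent by default)
}

void reader_thread(int& out) {
    // The atomic load is performed on every iteration; the compiler
    // may not hoist it out of the loop.
    while (!data_ready.load()) {
        std::this_thread::yield();
    }
    // The store to data_ready synchronizes with the load that saw true,
    // so reading payload here is not a data race.
    out = payload;
}

// Illustrative helper: runs one reader and one writer, returns what the reader saw.
int run_handoff() {
    data_ready.store(false);
    payload = 0;
    int seen = -1;
    std::thread reader(reader_thread, std::ref(seen));
    std::thread writer(writer_thread);
    writer.join();
    reader.join();
    assert(data_ready.load());  // the writer has published by now
    return seen;
}
```

Calling run_handoff() returns 42: the reader can only leave its loop after the writer has stored true, and by then the write to payload is guaranteed to be visible to it.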
The problem with volatile is that it only guarantees that accesses to the variable are not omitted and that they are not reordered relative to other volatile accesses. volatile does not imply a memory barrier, so it says nothing about when, or in which order relative to other memory operations, a write becomes visible to another processor. What this means is that writer_thread on processor A can write the value to its store buffer or cache (and maybe even to main memory) without reader_thread on processor B seeing it yet, or B can see the flag before it sees the data the flag is supposed to guard. For a more thorough explanation see memory barrier and cache coherence on Wikipedia.
There can be additional problems with more "complex" expressions than x = y (e.g. x += y) that would require synchronization through a lock (or in this simple case an atomic +=) to ensure the value of x does not change during processing.
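For the lock-based variant, a rough sketch could look like this. The names x_mutex, add_to_x and run_locked_adds are hypothetical, and the iteration count is arbitrary; the point is only that the whole read-modify-write happens while the mutex is held.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

int x = 0;           // shared, non-atomic variable
std::mutex x_mutex;  // hypothetical mutex guarding every access to x

// Adds y to x `times` times, taking the lock around each update.
void add_to_x(int y, int times) {
    assert(times >= 0);
    for (int i = 0; i < times; ++i) {
        std::lock_guard<std::mutex> lock(x_mutex);
        x += y;  // read, add and write back all happen under the lock
    }
}

// Illustrative helper: two threads each add 2 to x 100000 times.
int run_locked_adds() {
    x = 0;
    std::thread a(add_to_x, 2, 100000);
    std::thread b(add_to_x, 2, 100000);
    a.join();
    b.join();
    return x;  // both threads are joined, so this read is safe
}
```

With the lock in place run_locked_adds() returns 400000 every time, because no thread can read x between another thread's read and write.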
x += y, for example, is actually three steps:
1. read x
2. compute x + y
3. write the result back to x
If a context switch to another thread occurs during the computation, this can result in something like this (2 threads, both doing x += 2; assuming x = 0):
Thread A                    Thread B
------------------------    ------------------------
read x (0)
compute x (0) + 2
                            read x (0)
                            compute x (0) + 2
                            write x (2)
write x (2)
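Now x = 2 even though there were two += 2 computations. This effect is known as a lost update, a classic race condition. (Tearing, by contrast, is when a reader observes a value that has only been partially written.)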
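The atomic += mentioned earlier avoids this interleaving, because the read, the addition and the write happen as one indivisible operation. A small sketch, with the hypothetical names shared_x, add_two and run_atomic_adds and an arbitrary iteration count:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> shared_x{0};  // hypothetical shared counter

// Adds 2 to shared_x `times` times.
void add_two(int times) {
    assert(times >= 0);
    for (int i = 0; i < times; ++i) {
        shared_x += 2;  // atomic read-modify-write, equivalent to fetch_add(2)
    }
}

// Illustrative helper: two threads each perform 100000 atomic additions.
int run_atomic_adds() {
    shared_x = 0;
    std::thread a(add_two, 100000);
    std::thread b(add_two, 100000);
    a.join();
    b.join();
    return shared_x.load();
}
```

run_atomic_adds() returns 400000. If shared_x were a plain int, the program would contain a data race and the result would typically come out smaller, for exactly the reason shown in the table above.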