C++ : std::atomic and volatile bool

一向 2020-12-30 02:55

I'm just reading the book C++ Concurrency in Action by Anthony Williams. There is this classic example with two threads: one produces the data, the other one consumes the data a

3 Answers
  • 2020-12-30 03:10

    Ben Voigt's answer is completely correct, still a little theoretical, and as I've been asked by a colleague "what does this mean for me", I decided to try my luck with a little more practical answer.

    With your sample, the "simplest" optimization problem that could occur is the following:

    According to the Standard, an optimized execution order may not change the observable behavior of a program. The problem is that this is only guaranteed for single-threaded programs, or for each single thread of a multithreaded program viewed in isolation.

    So, for writer_thread and a (volatile) bool

    data.push_back(42);
    data_ready = true;
    

    and

    data_ready = true;
    data.push_back(42);
    

    are equivalent.

    The result is that

    std::cout << "The answer=" << data[0] << "\n";
    

    can be executed without having pushed any value into data.

    An atomic bool prevents this kind of optimization: with the default memory order (std::memory_order_seq_cst), the surrounding statements may not be reordered across the atomic operation. There are weaker memory orders (for example std::memory_order_release and std::memory_order_acquire) which allow statements to be moved across the operation in one direction but not in the other, but using them requires a solid understanding of your program's structure and of the problems they can cause.
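
    A minimal sketch of the sample with an atomic flag, assuming the names data, data_ready, writer_thread and reader_thread from the question. The wrapper function run_example and the explicit release/acquire memory orders are assumptions added for illustration; leaving the memory-order arguments out gives the stricter default and is the safer choice.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical wrapper (not in the original sample) so the sketch is self-contained.
int run_example() {
    std::vector<int> data;
    std::atomic<bool> data_ready{false};
    int answer = 0;

    std::thread reader_thread([&] {
        // Acquire load: nothing after it may be moved before it.
        while (!data_ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        answer = data[0];  // safe: the push_back happens before this read
    });

    std::thread writer_thread([&] {
        data.push_back(42);
        // Release store: nothing before it may be moved after it.
        data_ready.store(true, std::memory_order_release);
    });

    writer_thread.join();
    reader_thread.join();
    return answer;
}
```

    Calling run_example() returns the value the reader saw in data[0]. With a plain or volatile bool in place of std::atomic<bool>, the same code would contain a data race.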

  • 2020-12-30 03:18

    The big difference is that this code is correct, while the version with bool instead of atomic<bool> has undefined behavior.

    These two lines of code create a race condition (formally, a conflict) because one reads from and the other writes to the same variable:

    Reader

    while (!data_ready)
    

    And writer

    data_ready = true;
    

    And a data race on a normal (non-atomic) variable causes undefined behavior, according to the C++11 memory model.

    The rules are found in section 1.10 of the Standard, the most relevant being:

    Two actions are potentially concurrent if

    • they are performed by different threads, or
    • they are unsequenced, and at least one is performed by a signal handler.

    The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.

    You can see that whether the variable is atomic<bool> makes a very big difference to this rule.

  • 2020-12-30 03:23

    A "classic" bool, as you put it, would not work reliably (if at all). One reason for this is that the compiler could (and most likely does, at least with optimizations enabled) load data_ready only once from memory, because there is no indication that it ever changes in the context of reader_thread.

    You could work around this problem by using volatile bool to enforce loading it every time (which would probably seem to work), but this would still be undefined behavior according to the C++ standard, because the access to the variable is neither synchronized nor atomic.

    You could enforce synchronization using the locking facilities from the mutex header, but this would introduce (in your example) unnecessary overhead (hence std::atomic).
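
    For comparison, a sketch of the lock-based variant, assuming the same data and data_ready names. The function run_with_mutex is hypothetical, and std::condition_variable is an addition of this sketch so that the reader sleeps instead of polling; the paragraph above only mentions the mutex header.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper (not in the original sample).
int run_with_mutex() {
    std::vector<int> data;
    bool data_ready = false;  // plain bool is fine: every access holds the mutex
    std::mutex m;
    std::condition_variable cv;
    int answer = 0;

    std::thread reader_thread([&] {
        std::unique_lock<std::mutex> lock(m);
        // Sleeps until data_ready is true; the predicate handles spurious wakeups.
        cv.wait(lock, [&] { return data_ready; });
        answer = data[0];
    });

    std::thread writer_thread([&] {
        {
            std::lock_guard<std::mutex> lock(m);
            data.push_back(42);
            data_ready = true;
        }
        cv.notify_one();  // wake the reader after releasing the lock
    });

    writer_thread.join();
    reader_thread.join();
    return answer;
}
```

    This is correct, but it needs a lock, a condition variable and a possible trip through the scheduler just to hand over a single flag.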


    The problem with volatile is that it only guarantees that accesses to the volatile variable itself are not omitted and are not reordered relative to other volatile accesses. volatile does not guarantee a memory barrier, so there is no guarantee about when, or in what order, a write becomes visible to other processors. What this means is that writer_thread on processor A can write the value to its cache (and maybe even to the main memory) without reader_thread on processor B seeing it, because the cache of processor B is not necessarily consistent with the cache of processor A. For a more thorough explanation see memory barrier and cache coherence on Wikipedia.


    There can be additional problems with expressions more "complex" than x = y (e.g. x += y) that would require synchronization through a lock (or in this simple case an atomic +=) to ensure the value of x does not change during processing.

    x += y for example is actually:

    • read x
    • compute x + y
    • write result back to x

    If a context switch to another thread occurs during the computation this can result in something like this (2 threads, both doing x += 2; assuming x = 0):

    Thread A                 Thread B
    ------------------------ ------------------------
    read x (0)
    compute x (0) + 2
                     <context switch>
                             read x (0)
                             compute x (0) + 2
                             write x (2)
                     <context switch>
    write x (2)
    

    Now x = 2 even though there were two += 2 computations. This effect is known as a lost update.
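
    A small sketch of the atomic += mentioned above, with two threads each adding 2 to x. The function name add_twice_atomically is hypothetical. On std::atomic<int>, operator+= is a single indivisible read-modify-write, so the interleaving from the table cannot occur.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: two threads each perform x += 2 on an atomic int.
int add_twice_atomically() {
    std::atomic<int> x{0};

    // Equivalent to x.fetch_add(2): read, add and write happen as one step.
    std::thread a([&] { x += 2; });
    std::thread b([&] { x += 2; });

    a.join();
    b.join();
    return x.load();  // always 4, whatever the scheduling
}
```

    With a plain int in place of std::atomic<int>, the two increments would race, which is undefined behavior and in practice can lose one of the updates.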
