“pseudo-atomic” operations in C++

北海茫月 2021-02-02 17:44

So I'm aware that nothing is atomic in C++. But I'm trying to figure out if there are any "pseudo-atomic" assumptions I can make. The reason is that I want to avoid using

7 answers
  •  日久生厌
    2021-02-02 18:42

    Most of the answers correctly address the CPU memory ordering issues you're going to experience, but none have detailed how the compiler can thwart your intentions by re-ordering your code in ways that break your assumptions.

    Consider an example taken from this post:

    volatile int ready;       
    int message[100];      
    
    void foo(int i) 
    {      
        message[i/10] = 42;      
        ready = 1;      
    }
    

    At -O2 and above, recent versions of GCC and Intel C/C++ (don't know about VC++) will do the store to ready first, so it can be overlapped with computation of i/10 (volatile does not save you!):

        leaq    _message(%rip), %rax
        movl    $1, _ready(%rip)      ; <-- whoa Nelly!
        movq    %rsp, %rbp
        sarl    $2, %edx
        subl    %edi, %edx
        movslq  %edx,%rdx
        movl    $42, (%rax,%rdx,4)
    

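    This isn't a bug; it's the optimizer exploiting CPU pipelining. volatile only keeps accesses to volatile objects in order relative to one another, and message is not volatile, so the compiler is free to move the store to ready ahead of the store to message. If another thread is waiting on ready before accessing the contents of message, then you have a nasty and obscure race.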
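    As a minimal sketch of what closes this particular hole, the same function can be written with a compiler barrier between the two stores. This assumes GCC-compatible inline assembly and relies on x86 not reordering stores with other stores; it is not portable, and under the C++11 memory model it is formally still a data race.

```cpp
// Sketch only: GCC/Clang inline-asm syntax, x86 store ordering assumed.
volatile int ready;
int message[100];

void foo(int i)
{
    message[i / 10] = 42;
    // Compiler barrier: emits no instruction, but the compiler may not move
    // memory accesses across it, so the store to message is issued before
    // the store to ready.
    __asm__ __volatile__("" ::: "memory");
    ready = 1;
}
```

    A reader thread needs the mirror image: load ready, then a barrier, then read message.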

    Employ compiler barriers to ensure your intent is honored. One example, which also exploits the relatively strong ordering of x86, is the pair of release/consume wrappers found in Dmitriy Vyukov's Single-Producer Single-Consumer queue posted here:

    // load with 'consume' (data-dependent) memory ordering
    // NOTE: x86 specific, other platforms may need additional memory barriers
    template <typename T>
    T load_consume(T const* addr)
    {
      // volatile access forces a real load from memory
      T v = *const_cast<T const volatile*>(addr);
      __asm__ __volatile__ ("" ::: "memory"); // compiler barrier
      return v;
    }

    // store with 'release' memory ordering
    // NOTE: x86 specific, other platforms may need additional memory barriers
    template <typename T>
    void store_release(T* addr, T v)
    {
      __asm__ __volatile__ ("" ::: "memory"); // compiler barrier
      // volatile access forces a real store to memory
      *const_cast<T volatile*>(addr) = v;
    }
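
    For platforms other than x86, a portable counterpart can be sketched with C++11 std::atomic. The names load_acquire, Mailbox, send and receive below are hypothetical and are not part of Vyukov's queue. Acquire is used instead of consume because current compilers generally treat memory_order_consume as acquire anyway.

```cpp
#include <atomic>

// Portable counterparts of the wrappers above (hypothetical names).
template <typename T>
T load_acquire(std::atomic<T> const& a)
{
    return a.load(std::memory_order_acquire);
}

template <typename T>
void store_release(std::atomic<T>& a, T v)
{
    a.store(v, std::memory_order_release);
}

// Single-slot mailbox for one producer thread and one consumer thread.
struct Mailbox {
    int payload = 0;               // plain data, published via 'full'
    std::atomic<bool> full{false};
};

// Producer side: fails if the previous value has not been consumed yet.
bool send(Mailbox& m, int value)
{
    if (load_acquire(m.full))
        return false;
    m.payload = value;             // written before the release store
    store_release(m.full, true);
    return true;
}

// Consumer side: fails if nothing has been published.
bool receive(Mailbox& m, int& out)
{
    if (!load_acquire(m.full))
        return false;
    out = m.payload;               // read after the acquire load
    store_release(m.full, false);
    return true;
}
```

    The compiler and the library then emit whatever barriers the target needs; on x86 these loads and stores compile to plain moves.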
    

    I suggest that if you are going to venture into the realm of concurrent memory access, you use a library that will take care of these details for you. std::atomic, proposed in n2145, has been standard since C++11; where it is unavailable, check out Threading Building Blocks' tbb::atomic or boost::atomic.

    Besides correctness, these libraries can simplify your code and clarify your intent:

    // thread 1
    std::atomic<int> foo;  // or tbb::atomic<int>, boost::atomic<int>, etc
    foo.store(1, std::memory_order_release);
    
    // thread 2
    int tmp = foo.load(std::memory_order_acquire);
    

    Using explicit memory ordering, foo's inter-thread relationship is clear.
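
    Put together as a complete program fragment, the handoff might look like the sketch below. The names payload and run_handoff are assumptions for illustration. It requires C++11 and, on some toolchains, linking with -pthread.

```cpp
#include <atomic>
#include <thread>

int payload = 0;             // ordinary data, published through foo
std::atomic<int> foo{0};     // the flag from the snippet above

// Runs one producer and one consumer; returns what the consumer read.
int run_handoff()
{
    payload = 0;
    foo.store(0, std::memory_order_relaxed);

    int seen = -1;
    std::thread consumer([&seen] {
        // The acquire load pairs with the release store in the producer.
        while (foo.load(std::memory_order_acquire) == 0)
            std::this_thread::yield();
        seen = payload;      // guaranteed to observe the producer's write
    });
    std::thread producer([] {
        payload = 42;
        foo.store(1, std::memory_order_release);
    });

    producer.join();
    consumer.join();
    return seen;
}
```

    Because the write to payload happens before the release store, and the acquire load happens before the read of payload, the consumer cannot observe a stale value once it has seen foo become nonzero.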
