“pseudo-atomic” operations in C++

前端未结

关注

 7  1717

北海茫月 2021-02-02 17:44

So I\'m aware that nothing is atomic in C++. But I\'m trying to figure out if there are any \"pseudo-atomic\" assumptions I can make. The reason is that I want to avoid using

7条回答

日久生厌 (楼主)

2021-02-02 18:42
Most of the answers correctly address the CPU memory ordering issues you're going to experience, but none have detailed how the compiler can thwart your intentions by re-ordering your code in ways that break your assumptions.

Consider an example taken from this post:
```
volatile int ready;       
int message[100];      

void foo(int i) 
{      
    message[i/10] = 42;      
    ready = 1;      
}
```
At -O2 and above, recent versions of GCC and Intel C/C++ (don't know about VC++) will do the store to ready first, so it can be overlapped with computation of i/10 (volatile does not save you!):
```
    leaq    _message(%rip), %rax
    movl    $1, _ready(%rip)      ; <-- whoa Nelly!
    movq    %rsp, %rbp
    sarl    $2, %edx
    subl    %edi, %edx
    movslq  %edx,%rdx
    movl    $42, (%rax,%rdx,4)
```
This isn't a bug, it's the optimizer exploiting CPU pipelining. If another thread is waiting on ready before accessing the contents of message then you have a nasty and obscure race.

Employ compiler barriers to ensure your intent is honored. An example that also exploits the relatively strong ordering of x86 are the release/consume wrappers found in Dmitriy Vyukov's Single-Producer Single-Consumer queue posted here:
```
// load with 'consume' (data-dependent) memory ordering 
// NOTE: x86 specific, other platforms may need additional memory barriers
template 
T load_consume(T const* addr) 
{  
  T v = *const_cast(addr); 
  __asm__ __volatile__ ("" ::: "memory"); // compiler barrier 
  return v; 
} 

// store with 'release' memory ordering 
// NOTE: x86 specific, other platforms may need additional memory barriers
template 
void store_release(T* addr, T v) 
{ 
  __asm__ __volatile__ ("" ::: "memory"); // compiler barrier 
  *const_cast(addr) = v; 
} 
```
I suggest that if you are going to venture into the realm of concurrent memory access, use a library that will take care of these details for you. While we all wait for n2145 and std::atomic check out Thread Building Blocks' tbb::atomic or the upcoming boost::atomic.

Besides correctness, these libraries can simplify your code and clarify your intent:
```
// thread 1
std::atomic foo;  // or tbb::atomic, boost::atomic, etc
foo.store(1, std::memory_order_release);

// thread 2
int tmp = foo.load(std::memory_order_acquire);
```
Using explicit memory ordering, foo's inter-thread relationship is clear.
0 讨论(0)

查看其它7个回答
发布评论:

提交评论
- 加载中...