Does the MOV x86 instruction implement a C++11 memory_order_release atomic store?

别跟我提以往 2021-02-09 01:44

According to this mapping table, https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html, a release store is implemented as a plain MOV (into memory) on x86 (including x86-64).

Ac

2 answers
  •  臣服心动
    2021-02-09 02:04

    That does appear to be the mapping, at least in code compiled with the Intel compiler, where I see:

    0000000000401100 <_Z5storeRSt6atomicIiE>:
      401100:       48 89 fa                mov    %rdi,%rdx
      401103:       b8 32 00 00 00          mov    $0x32,%eax
      401108:       89 02                   mov    %eax,(%rdx)
      40110a:       c3                      retq
      40110b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    
    0000000000401110 <_Z4loadRSt6atomicIiE>:
      401110:       48 89 f8                mov    %rdi,%rax
      401113:       8b 00                   mov    (%rax),%eax
      401115:       c3                      retq
      401116:       0f 1f 00                nopl   (%rax)
      401119:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
    

    for the code:

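    #include <atomic>
    #include <stdio.h>
    
    // Release store of a constant into the atomic.
    void store( std::atomic<int> & b ) ;
    
    // Acquire load of the atomic's current value.
    int load( std::atomic<int> & b ) ;
    
    int main()
    {
       std::atomic<int> b ;
    
       store( b ) ;
    
       printf("%d\n", load( b ) ) ;
    
       return 0 ;
    }
    
    void store( std::atomic<int> & b )
    {
       // Compiles to a plain mov on x86-64 (see the disassembly above).
       b.store(50, std::memory_order_release ) ;
    }
    
    int load( std::atomic<int> & b )
    {
       int v = b.load( std::memory_order_acquire ) ;
    
       return v ;
    }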
    

    The current Intel architecture documentation, Volume 3 (System Programming Guide), does a nice job of explaining this. See:

    8.2.2 Memory Ordering in P6 and More Recent Processor Families

    • Reads are not reordered with other reads.
    • Writes are not reordered with older reads.
    • Writes to memory are not reordered with other writes, with the following exceptions: ...

    The full memory model is explained there. A release store only requires that earlier reads and writes are not reordered after it, and the rules above already give an ordinary store that property. I'd assume that Intel and the C++ standard folks worked together in detail to nail down the best mapping for each memory-order operation that conforms to the memory model described in Volume 3, and that plain stores and loads were determined to be sufficient in those cases.
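
    To make the guarantee concrete, here is a minimal sketch of the usual release/acquire "message passing" pattern. The names (Mailbox, produce, consume, run_message_passing) are made up for illustration and are not from the question or the Intel manual. The point is only that the release store to the flag is what makes the earlier plain write to the payload visible to a thread whose acquire load sees the flag.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical type for illustration: plain data guarded by an atomic flag.
struct Mailbox {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // published with release semantics
};

// Producer: write the payload first, then publish it with a release store.
// On x86-64 this store is expected to compile to a plain mov.
void produce(Mailbox& box, int value) {
    box.payload = value;
    box.ready.store(true, std::memory_order_release);
}

// Consumer: spin until the acquire load observes the flag, then read the payload.
int consume(Mailbox& box) {
    while (!box.ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return box.payload;  // ordered after the producer's write to payload
}

// Runs one producer and one consumer thread; returns what the consumer read.
int run_message_passing(int value) {
    Mailbox box;
    int seen = -1;
    std::thread consumer([&] { seen = consume(box); });
    std::thread producer([&] { produce(box, value); });
    producer.join();
    consumer.join();
    assert(seen == value);
    return seen;
}
```

    Calling run_message_passing(50) should hand back 50 on any conforming implementation; what differs between architectures is only which instructions the compiler needs to emit to get there.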

    Note that just because no special instructions are required for this ordered store on x86-64, that doesn't mean the same will be true everywhere. For PowerPC I'd expect to see something like an lwsync instruction along with the store, and on HP-UX (ia64) the compiler should be using an st4.rel instruction.
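
    If you want to check this on your own compiler and target, a small sketch like the following can be compiled to assembly (for example with something like `g++ -O2 -S`) and compared across architectures. The function names are hypothetical. My expectation, which you should verify rather than take on faith, is that on x86-64 the relaxed and release stores both come out as a plain mov, while the seq_cst store gets an xchg or a mov followed by mfence, as in the mapping table linked in the question.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helpers: one store per memory order, so each shows up
// as its own function in the generated assembly.
void store_relaxed(std::atomic<int>& a, int v) {
    a.store(v, std::memory_order_relaxed);
}

void store_release(std::atomic<int>& a, int v) {
    a.store(v, std::memory_order_release);
}

void store_seq_cst(std::atomic<int>& a, int v) {
    a.store(v, std::memory_order_seq_cst);
}

// Quick sanity check that all three really store; returns the last value written.
int store_all_orders() {
    std::atomic<int> a{0};
    store_relaxed(a, 1);
    store_release(a, 2);
    assert(a.load(std::memory_order_acquire) == 2);
    store_seq_cst(a, 50);
    return a.load(std::memory_order_acquire);
}
```

    Keeping each ordering in a separate non-inlined function is just a convenience for reading the disassembly; it does not change what the stores mean.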
