GCC reordering up across load with `memory_order_seq_cst`. Is this allowed?

刺人心 2021-02-14 07:32

Using a simplified version of a basic seqlock, gcc reorders a non-atomic load up across an atomic load (memory_order_seq_cst) when compiling the code with -O3.

2 answers
  •  无人共我
    2021-02-14 07:55

    Congratulations, I think you've hit a bug in gcc!

    You can make a reasonable argument, as the other answer does, that gcc could have correctly optimized the original code you showed that way. The argument is fairly obscure and rests on the unconditional access to value: you can't have been relying on a synchronizes-with relationship between the load seq0 = seq_.load(); and the subsequent read of value, so reading it "somewhere else" shouldn't change the semantics of a race-free program. I'm not actually sure of this argument, but here's a "simpler" case I got from reducing your code:

    #include <atomic>
    #include <cstddef>
    
    std::atomic<std::size_t> seq_;
    std::size_t value;
    
    auto load()
    {
        std::size_t copy;
        std::size_t seq0;
        do
        {
            seq0 = seq_.load();
            if (!seq0) continue;
            copy = value;
            seq0 = seq_.load();
        } while (!seq0);
    
        return copy;
    }
    

    This isn't a seqlock or anything - it just waits for seq_ to change from zero to non-zero, and then reads value. The second read of seq_ is superfluous, as is the while condition, but without them the bug goes away.

    This is now the read side of the well-known idiom which does work and is race-free: one thread writes to value, then sets seq_ non-zero with a release store. The threads calling load see the non-zero store, synchronize with it, and so can safely read value. Of course, you can't keep writing to value, it's a "one time" initialization, but this is a common pattern.
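
    For reference, here is a minimal sketch of both sides of that idiom. The names ready, payload, publish and consume are hypothetical and only for illustration: ready stands in for seq_ and payload for value. It uses explicit release/acquire orderings, whereas the reduced code above uses the default seq_cst load, which is at least as strong.

    ```cpp
    #include <atomic>
    #include <cstddef>
    #include <thread>

    // Hypothetical names for illustration: `ready` stands in for seq_,
    // `payload` for value.
    std::atomic<std::size_t> ready{0};
    std::size_t payload = 0;

    // Writer: plain write first, then a release store that publishes it.
    void publish(std::size_t v)
    {
        payload = v;
        ready.store(1, std::memory_order_release);
    }

    // Reader: spin until the acquire load observes the non-zero flag.
    // That load synchronizes with the release store, so the plain read
    // of payload afterwards is race-free.
    std::size_t consume()
    {
        while (ready.load(std::memory_order_acquire) == 0)
        {
            std::this_thread::yield();
        }
        return payload;
    }
    ```

    Running publish on one thread and consume on another should return the published value, provided publish is called only once. If the compiler hoists the read of payload above the loop, as in the assembly shown in this answer, the reader can return a stale value instead.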

    With the above code, gcc is still hoisting the read of value:

    load():
            mov     rax, QWORD PTR value[rip]
    .L2:
            mov     rdx, QWORD PTR seq_[rip]
            test    rdx, rdx
            je      .L2
            mov     rdx, QWORD PTR seq_[rip]
            test    rdx, rdx
            je      .L2
            rep ret
    

    Oops!

    This behavior occurs up to gcc 7.3, but not in 8.1. Your code also compiles as you wanted in 8.1:

        mov     rbx, QWORD PTR seq_[rip]
        mov     rbp, QWORD PTR value[rip]
        mov     rax, QWORD PTR seq_[rip]
    
