How do “acquire” and “consume” memory orders differ, and when is “consume” preferable?

情书的邮戳 2021-01-29 19:40

The C++11 standard defines a memory model (1.7, 1.10) which contains memory orderings, which are, roughly, "sequentially-consistent", "acquire", "consume", "rele

4 answers
  •  一向 (OP)
    2021-01-29 20:05

    I'd like to record a partial finding, even though it's not a real answer and doesn't mean that there won't be a big bounty for a proper answer.

    After staring at 1.10 for a while, and in particular the very helpful note in paragraph 11, I think this isn't actually so hard. The big difference between synchronizes-with (henceforth: s/w) and dependency-ordered-before (dob) is that a happens-before relationship can be established by concatenating sequenced-before (s/b) and s/w arbitrarily, but not so for dob. Note one of the definitions for inter-thread happens before:

    A synchronizes-with X and X is sequenced before B

    But the analogous statement for "A is dependency-ordered before X" is missing!

    So with release/acquire (i.e. s/w) we can order arbitrary events:

    A1    s/b    B1                                            Thread 1
                       s/w
                              C1    s/b    D1                  Thread 2
    

    But now consider an arbitrary sequence of events like this:

    A2    s/b    B2                                            Thread 1
                       dob
                              C2    s/b    D2                  Thread 2
    

    In this sequence, it is still true that A2 happens-before C2 (because A2 is s/b B2 and B2 inter-thread happens before C2 on account of dob; but we could argue that you can never actually tell!). However, it is not true that A2 happens-before D2. The events A2 and D2 are not ordered with respect to one another, unless it actually holds that C2 carries a dependency to D2. This is a stricter requirement, and absent that requirement, A2-to-D2 cannot be ordered "across" the release/consume pair.

    In other words, a release/consume pair only propagates an ordering of actions which carry a dependency from one to the next. Everything that's not dependent is not ordered across the release/consume pair.
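
    A minimal sketch of such a dependency chain, assuming a hypothetical Node type and a global head pointer (neither is from the original post). Each read in the consumer uses the value produced by the previous one, so every step carries a dependency from the consume load. Note that mainstream compilers currently implement consume as acquire, so the weaker guarantee is not observable in practice.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical node type, used only for this illustration.
struct Node {
    int value;
    Node* next;
};

std::atomic<Node*> head{nullptr};

// Returns the value the consumer reads through head -> next -> value.
int run_chain_demo()
{
    Node second{0, nullptr};
    Node first{0, nullptr};
    int seen = 0;
    head.store(nullptr, std::memory_order_relaxed);

    std::thread producer([&] {
        second.value = 42;                              // plain writes, sequenced
        first.next = &second;                           // before the release store
        head.store(&first, std::memory_order_release);
    });

    std::thread consumer([&] {
        Node* p;
        // Spin until the release store becomes visible.
        while ((p = head.load(std::memory_order_consume)) == nullptr) {
            std::this_thread::yield();
        }
        Node* q = p->next;    // uses p: carries a dependency from the load
        assert(q != nullptr);
        seen = q->value;      // uses q: the dependency chain continues
        // A read of some unrelated global here would NOT be ordered.
    });

    producer.join();
    consumer.join();
    return seen;
}
```

    Calling run_chain_demo() should yield 42, because both reads in the consumer are reached through the loaded pointer.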

    Furthermore, note that the ordering is restored if we append a final, stronger release/acquire pair:

    A2    s/b    B2                                                         Th 1
                       dob
                              C2    s/b    D2                               Th 2
                                                 s/w
                                                        E2    s/b    F2     Th 3
    

    Now, by the quoted rule, D2 inter-thread happens before F2, and therefore so do C2 and B2, and so A2 happens-before F2. But note that there is still no ordering between A2 and D2 — the ordering is only between A2 and later events.
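
    The three-thread diagram can be sketched roughly as follows. The variable names (payload, ptr, ready) are assumptions made for illustration; the comments map each statement to the event labels A2 through F2.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                    // plain, non-atomic data
std::atomic<int*> ptr{nullptr};     // release/consume pair (B2 -> C2)
std::atomic<bool> ready{false};     // release/acquire pair (D2 -> E2)

// Returns the value of payload as observed by thread 3.
int run_three_thread_demo()
{
    payload = 0;
    ptr.store(nullptr, std::memory_order_relaxed);
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread t1([] {
        payload = 51;                                    // A2
        ptr.store(&payload, std::memory_order_release);  // B2
    });

    std::thread t2([] {
        // C2: the consume load that finally reads the pointer from B2.
        while (ptr.load(std::memory_order_consume) == nullptr) {
            std::this_thread::yield();
        }
        // D2: does not depend on the loaded value; it is only
        // sequenced after C2.
        ready.store(true, std::memory_order_release);
    });

    std::thread t3([&seen] {
        // E2: acquire load that synchronizes with D2.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;                                  // F2
    });

    t1.join();
    t2.join();
    t3.join();
    return seen;
}
```

    Thread 2 never touches payload itself, which matches the argument above: A2 is ordered before F2, while nothing is claimed about A2 and D2.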

    In summary and in closing, dependency carrying is a strict subset of general sequencing, and release/consume pairs provide an ordering only among actions that carry dependency. As long as no stronger ordering is required (e.g. by passing through a release/acquire pair), there is theoretically a potential for additional optimization, since everything that is not in the dependency chain may be reordered freely.


    Maybe here is an example that makes sense?

    #include <atomic>
    #include <cassert>
    
    std::atomic<int> foo(0);
    
    int x = 0;
    
    void thread1()
    {
        x = 51;
        foo.store(10, std::memory_order_release);
    }
    
    void thread2()
    {
        if (foo.load(std::memory_order_acquire) == 10)
        {
            assert(x == 51);
        }
    }
    

    As written, the code is race-free and the assertion will hold, because the release/acquire pair orders the store x = 51 before the load in the assertion. However, by changing "acquire" into "consume", this would no longer be true and the program would have a data race on x, since the read of x in the assertion carries no dependency from the load of foo (the comparison only controls a branch, and a control dependency does not count). The optimization point is that accesses to x can be reordered freely without concern for what foo is doing, because there is no dependency.
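
    For contrast, here is a hedged variant in which consume would suffice. It assumes foo is changed to hold a pointer (called foo_ptr here, a hypothetical name), so that the reader reaches x through the loaded value and the read carries a dependency.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int x = 0;
std::atomic<int*> foo_ptr{nullptr};  // hypothetical pointer-valued stand-in for foo

void publish()
{
    x = 51;
    foo_ptr.store(&x, std::memory_order_release);
}

int read_through_pointer()
{
    int* p;
    // Spin until the pointer published by the release store is visible.
    while ((p = foo_ptr.load(std::memory_order_consume)) == nullptr) {
        std::this_thread::yield();
    }
    return *p;  // dereferences p, so this read carries a dependency from the load
}

// Runs both threads once and returns what the reader saw.
int run_consume_demo()
{
    x = 0;
    foo_ptr.store(nullptr, std::memory_order_relaxed);
    int seen = 0;

    std::thread writer(publish);
    std::thread reader([&seen] { seen = read_through_pointer(); });

    writer.join();
    reader.join();
    return seen;
}
```

    Here run_consume_demo() should return 51. Reading x directly by name instead of through p would bring back the data race described above.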
