Is std::mutex sequentially consistent?

Question

Say I have two threads, A and B, writing to global Boolean variables fA and fB respectively. Both variables are initially set to false and are protected by std::mutex objects mA and mB respectively:

// Thread A
mA.lock();
assert( fA == false );
fA = true;
mA.unlock();

// Thread B
mB.lock();
assert( fB == false );
fB = true;
mB.unlock();

Is it possible to observe the modifications on fA and fB in different orders in different threads C and D? In other words, can the following program

#include <atomic>
#include <cassert>
#include <iostream>
#include <mutex>
#include <thread>
using namespace std;

mutex mA, mB, coutMutex;
bool fA = false, fB = false;

int main()
{
    thread A{ []{
            lock_guard<mutex> lock{mA};
            fA = true;
        } };
    thread B{ [] {
            lock_guard<mutex> lock{mB};
            fB = true;
        } };
    thread C{ [] { // reads fA, then fB
            mA.lock();
            const auto _1 = fA;
            mA.unlock();
            mB.lock();
            const auto _2 = fB;
            mB.unlock();
            lock_guard<mutex> lock{coutMutex};
            cout << "Thread C: fA = " << _1 << ", fB = " << _2 << endl;
        } };
    thread D{ [] { // reads fB, then fA (i.e. the opposite order to C)
            mB.lock();
            const auto _3 = fB;
            mB.unlock();
            mA.lock();
            const auto _4 = fA;
            mA.unlock();
            lock_guard<mutex> lock{coutMutex};
            cout << "Thread D: fA = " << _4 << ", fB = " << _3 << endl;
        } };
    A.join(); B.join(); C.join(); D.join();
}

legally print

Thread C: fA = 1, fB = 0
Thread D: fA = 0, fB = 1

according to the C++ standard?

Note: A spin-lock can be implemented with std::atomic<bool> variables using either sequentially consistent memory order or acquire/release memory order. So the question is whether a std::mutex behaves like a sequentially consistent spin-lock or like an acquire/release spin-lock.
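To make the note concrete, here is a minimal sketch of the acquire/release variant. The names SpinLock and count_with_spinlock are made up for illustration and are not part of the standard library; the sequentially consistent variant would be the same code with both memory orders replaced by std::memory_order_seq_cst.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical acquire/release spin-lock, for illustration only.
class SpinLock {
public:
    void lock() {
        // exchange() returns the previous value, so the loop ends only
        // when this thread is the one that flipped false -> true.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release store: publishes everything written inside the
        // critical section to the next thread that acquires the lock.
        locked_.store(false, std::memory_order_release);
    }
private:
    std::atomic<bool> locked_{false};
};

// Hypothetical helper: several threads increment a plain int under the lock
// and the final value is returned.
int count_with_spinlock(int threads, int increments_per_thread) {
    assert(threads > 0 && increments_per_thread >= 0);
    SpinLock lock;
    int counter = 0;  // deliberately non-atomic; protected by the lock
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling count_with_spinlock(4, 10000) should return 40000 if the lock provides mutual exclusion and visibility, which is all that a mutex-like lock has to provide here.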

Answer 1:

Edit: I originally answered "yes, that is allowed", which was wrong. That output isn't possible, but std::mutex is not necessarily sequentially consistent. Acquire/release is enough to rule out that behaviour.

The standard does not define std::mutex to be sequentially consistent. It only requires that:

30.4.1.2 Mutex types [thread.mutex.requirements.mutex]

11 Synchronization: Prior unlock() operations on the same object shall synchronize with (1.10) this operation [lock()].

Synchronizes-with seems to be defined in the same way as for std::memory_order release/acquire (see this question).
As far as I can see, an acquire/release spinlock would satisfy the standard's requirements for std::mutex.

Big edit:

However, I don't think that means what you think (or what I thought). The output is still not possible, since acquire/release semantics are enough to rule it out. This is a kind of subtle point that is better explained here. It seems obviously impossible at first but I think it's right to be cautious with stuff like this.

From the standard, unlock() synchronises with lock(). That means anything that happens before the unlock() is visible after the lock(). Happens-before (written -> from here on) is a slightly weird relation, explained better in the above link. But because there are mutexes around everything in this example, everything works as you'd expect: const auto _1 = fA; happens before const auto _2 = fB;, and any changes visible to a thread when it unlock()s the mutex are visible to the next thread that lock()s the mutex. The relation also has the properties you'd expect: if X -> Y and Y -> Z, then X -> Z, and if X -> Y then Y does not happen before X.
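As a small sketch of what "visible after lock()" means in practice, the following hands a plain int from one thread to another using nothing but a mutex. The function name handoff_through_mutex and the value 42 are arbitrary choices for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical helper: returns the payload value the consumer observed.
int handoff_through_mutex() {
    std::mutex m;
    int payload = 0;     // plain, non-atomic data
    bool ready = false;  // also plain; only touched while holding m

    std::thread producer([&] {
        std::lock_guard<std::mutex> lock(m);
        payload = 42;
        ready = true;
    });  // the lock_guard's unlock() happens at the end of the lambda body

    int seen = -1;
    std::thread consumer([&] {
        for (;;) {
            {
                std::lock_guard<std::mutex> lock(m);
                // If ready is true here, the producer's unlock()
                // synchronised with this lock(), so payload is visible too.
                if (ready) {
                    seen = payload;
                    return;
                }
            }
            std::this_thread::yield();  // let the producer run
        }
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Because both writes happen before the producer's unlock(), a consumer that sees ready == true under the lock must also see payload == 42.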

From here it's not hard to derive the contradiction, which confirms what seems intuitively correct.

In short, there's a well-defined order of operations for each mutex: for mutex mA, threads A, C and D hold the lock in some sequence. For thread D to print fA=0, it must lock mA before thread A does; for thread C to print fA=1, it must lock mA after thread A. So the lock sequence for mA is D(mA) -> A(mA) -> C(mA).

By the same reasoning, since thread C prints fB=0 and thread D prints fB=1, the sequence for mutex mB must be C(mB) -> B(mB) -> D(mB).

But from the program we know C(mA) -> C(mB), so that lets us put both together to get D(mA) -> A(mA) -> C(mA) -> C(mB) -> B(mB) -> D(mB), which means D(mA) -> D(mB). But the code also gives us D(mB) -> D(mA), which is a contradiction, meaning your observed output is not possible.
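The argument above can be checked empirically with a small litmus-style harness. This is only a sketch: run_trial and count_forbidden are hypothetical helper names, the results are stored in locals instead of being printed, and a run that finds no hits proves nothing by itself. It merely agrees with the reasoning.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// One run of the question's four-thread program. Returns true if the
// outcome asked about occurred: C sees fA=1, fB=0 and D sees fA=0, fB=1.
bool run_trial() {
    std::mutex mA, mB;
    bool fA = false, fB = false;
    bool c_fA = false, c_fB = false, d_fA = false, d_fB = false;

    std::thread A([&] { std::lock_guard<std::mutex> l(mA); fA = true; });
    std::thread B([&] { std::lock_guard<std::mutex> l(mB); fB = true; });
    std::thread C([&] {  // reads fA, then fB
        { std::lock_guard<std::mutex> l(mA); c_fA = fA; }
        { std::lock_guard<std::mutex> l(mB); c_fB = fB; }
    });
    std::thread D([&] {  // reads fB, then fA
        { std::lock_guard<std::mutex> l(mB); d_fB = fB; }
        { std::lock_guard<std::mutex> l(mA); d_fA = fA; }
    });
    A.join(); B.join(); C.join(); D.join();

    return c_fA && !c_fB && !d_fA && d_fB;
}

// Repeats the trial and counts how often the outcome was seen.
int count_forbidden(int trials) {
    assert(trials >= 0);
    int hits = 0;
    for (int i = 0; i < trials; ++i) {
        if (run_trial()) ++hits;
    }
    return hits;
}
```

By the happens-before chain derived above, count_forbidden should return 0 for any number of trials on a conforming implementation.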

This outcome is no different for an acquire/release spinlock. I think everyone was confusing regular acquire/release memory access on a variable with access to a variable protected by a spinlock. The difference is that with a spinlock, the reading threads also perform a compare/exchange and a release write, which is a completely different scenario from a single release write and acquire read.
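For contrast, here is a sketch of the lock-free scenario that the spinlock case gets confused with: two independent atomic writes and two readers that use plain loads (often called the IRIW litmus test). The helper name count_iriw_disagreements and its parameters are made up for illustration. With release stores and acquire loads the standard permits the readers to disagree on the order of the writes, although many machines (x86, for example) never show it; with memory_order_seq_cst on every operation it is forbidden.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical litmus helper: counts trials in which reader C sees
// x before y while reader D sees y before x.
int count_iriw_disagreements(std::memory_order store_order,
                             std::memory_order load_order,
                             int trials) {
    assert(trials >= 0);
    int hits = 0;
    for (int i = 0; i < trials; ++i) {
        std::atomic<bool> x{false}, y{false};
        bool c_x = false, c_y = false, d_x = false, d_y = false;

        std::thread A([&] { x.store(true, store_order); });
        std::thread B([&] { y.store(true, store_order); });
        std::thread C([&] {  // reads x, then y; no read-modify-write involved
            c_x = x.load(load_order);
            c_y = y.load(load_order);
        });
        std::thread D([&] {  // reads y, then x
            d_y = y.load(load_order);
            d_x = x.load(load_order);
        });
        A.join(); B.join(); C.join(); D.join();

        if (c_x && !c_y && d_y && !d_x) ++hits;
    }
    return hits;
}
```

Only the sequentially consistent call, count_iriw_disagreements(std::memory_order_seq_cst, std::memory_order_seq_cst, n), is guaranteed to return 0. The release/acquire call may return a non-zero count on hardware that lets different cores see independent writes in different orders.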

If you used a sequentially consistent spinlock then this wouldn't affect the output. The only difference is that you could always categorically answer questions like "mutex A was locked before mutex B" from a separate thread that didn't acquire either lock. But for this example and most others, that kind of statement isn't useful, hence acquire/release being the standard.

Source: https://stackoverflow.com/questions/41847511/is-stdmutex-sequentially-consistent

Tags

c++

multithreading

c++11

mutex

memory-barriers