How do compilers like GCC implement acquire/release semantics for std::mutex?

慢半拍i 2021-02-12 09:44

My understanding is that std::mutex lock and unlock have acquire/release semantics, which prevents the instructions between them from being moved outside the critical section.

So acquire/re

4 answers
  •  无人共我
    2021-02-12 10:18

    NOTE: I am no expert in this area, and my knowledge of it is in a spaghetti-like state, so take this answer with a grain of salt.

    NOTE-2: This might not be the answer the OP is expecting, but here are my 2 cents anyway in case it helps:

    Quoting the question: "My question is that I took a look at the GCC 5.1 code base and don't see anything special in std::mutex::lock/unlock to prevent the compiler from reordering code."

    g++ uses the pthread library, and std::mutex is just a thin wrapper around pthread_mutex. So you will have to go and look at pthread's mutex implementation.
    If you go a bit deeper into the pthread implementation (which you can find here), you will see that it uses atomic instructions along with futex calls.
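The "thin wrapper" claim can be checked from user code. This is a minimal sketch, assuming libstdc++ on a pthread platform such as Linux, where `native_handle()` returns a `pthread_mutex_t*`. The helper name `underlying_mutex_is_locked` is made up for illustration.

```cpp
#include <cerrno>
#include <mutex>
#include <pthread.h>
#include <type_traits>

// Assumption: std::mutex is implemented on top of pthread_mutex_t
// (true for libstdc++ with the POSIX thread model, not guaranteed elsewhere).
static_assert(std::is_same<std::mutex::native_handle_type, pthread_mutex_t*>::value,
              "this sketch assumes std::mutex wraps pthread_mutex_t");

// Hypothetical helper: probes the wrapped pthread mutex directly.
// Returns true if the mutex is currently held.
bool underlying_mutex_is_locked(std::mutex& m) {
    pthread_mutex_t* raw = m.native_handle();
    int rc = pthread_mutex_trylock(raw);
    if (rc == 0) {
        // We acquired it, so it was free. Release it again.
        pthread_mutex_unlock(raw);
        return false;
    }
    return rc == EBUSY;  // EBUSY means somebody (possibly us) holds it
}
```

Locking through `std::mutex::lock()` and then probing with `pthread_mutex_trylock` shows that both APIs operate on the same object, so any ordering guarantee has to come from the pthread implementation.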

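    Three minor things to remember here:
    1. The atomic instructions act as barriers (on x86, for example, a lock-prefixed instruction is a full barrier).
    2. A call to a function whose body the compiler cannot see acts as a compiler barrier: accesses to any memory the callee could reach cannot be moved across the call. It is not a hardware barrier by itself. I do not remember where I read this.
    3. Mutex calls may put the thread to sleep and cause a context switch.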

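    Now, as far as reordering goes, what needs to be guaranteed is that no instruction after lock and before unlock is reordered to before lock or after unlock. I believe this does not require a full barrier, just an acquire barrier on lock and a release barrier on unlock. How much work that takes is platform dependent: x86 has a strong memory model (ordinary loads and stores already have acquire/release ordering, although it is not full sequential consistency, since a store can be reordered after a later load), whereas ARM provides weaker ordering guarantees and needs explicit barrier or acquire/release instructions.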
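To make the acquire-on-lock / release-on-unlock idea concrete, here is a minimal sketch of a spinlock built on `std::atomic`. It is a teaching example only, not how glibc implements pthread_mutex (which also uses futex calls to sleep). `SpinLock` and `count_with_spinlock` are hypothetical names.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical teaching lock: acquire on lock, release on unlock.
class SpinLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        // acquire: accesses after this point cannot be moved above it
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();  // somebody else holds the lock
        }
    }
    void unlock() {
        // release: accesses before this point cannot be moved below it
        locked_.store(false, std::memory_order_release);
    }
};

// Increments a plain (non-atomic) counter from several threads.
// The result is only correct if the lock keeps each increment
// inside the critical section.
long count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Neither operation is a full (sequentially consistent) barrier, yet the counter comes out right, because nothing inside the critical section may leak out past the acquire or the release.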

    I strongly recommend this blog series: http://preshing.com/archives/ It explains a lot of low-level stuff in easy-to-understand language. I guess I have to read it once again :)

    UPDATE: Unable to comment on @Cort Ammon's answer due to length

    @Kane I am not sure about this, but in general people write barriers at the processor level, and those take care of compiler-level reordering as well. The reverse is not true: a compiler-only barrier stops the compiler from reordering but emits no hardware fence.
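In standard C++ the two kinds of barrier are spelled `std::atomic_thread_fence` (constrains both the compiler and the processor) and `std::atomic_signal_fence` (constrains the compiler only). The sketch below uses the former to pass a plain value between threads. `pass_message_with_fences` is a hypothetical name.

```cpp
#include <atomic>
#include <thread>

// Passes a plain int from one thread to another using fences.
int pass_message_with_fences() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};
    int seen = 0;

    std::thread producer([&] {
        payload = 42;
        // Processor-level release fence; it also stops the compiler
        // from sinking the payload write below the flag store.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });

    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Matching acquire fence: the payload read cannot move above it.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}

// For contrast, a compiler-only barrier. It orders code with respect to
// a signal handler on the same thread and is not sufficient to
// synchronize two threads.
inline void compiler_only_barrier() {
    std::atomic_signal_fence(std::memory_order_seq_cst);
}
```

Replacing the two thread fences with `compiler_only_barrier()` would make the program formally racy, even though it might appear to work on a strongly ordered machine such as x86.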

    Now, since the definitions of the pthread_*lock* functions are not present in the translation unit where you use them (though I am not certain of this), the compiler has to treat the calls to lock and unlock as barriers and cannot move memory accesses across them. The pthread implementation for the platform uses atomic instructions to keep any other thread from entering the critical section between lock and unlock. Since only one thread at a time executes the critical section, any reordering within it will not change the expected behaviour, as mentioned in the comment above.

    Atomics are pretty tough to understand and to get right, so what I have written above is just my understanding. I would be very glad to be told if it is wrong.
