How is a memory barrier used in the Linux kernel?

Asked by 太阳男子 on 2021-02-09 09:16

There is an illustration in kernel source Documentation/memory-barriers.txt, like this:

    CPU 1                   CPU 2
    ======================  ======================
        { A == 1, B == 2, C == 3, P == &A, Q == &C }
        B = 4;
        <write barrier>
        WRITE_ONCE(P, &B);
                            Q = READ_ONCE(P);
                            D = *Q;
2 Answers
  • 2021-02-09 09:51

    From the section of the document titled "WHAT MAY NOT BE ASSUMED ABOUT MEMORY BARRIERS?":

    There is no guarantee that any of the memory accesses specified before a memory barrier will be complete by the completion of a memory barrier instruction; the barrier can be considered to draw a line in that CPU's access queue that accesses of the appropriate type may not cross.

    and

    There is no guarantee that a CPU will see the correct order of effects from a second CPU's accesses, even if the second CPU uses a memory barrier, unless the first CPU also uses a matching memory barrier (see the subsection on "SMP Barrier Pairing").

    What memory barriers do (in a very simplified way, of course) is make sure neither the compiler nor in-CPU hardware perform any clever attempts at reordering load (or store) operations across a barrier, and that the CPU correctly perceives changes to the memory made by other parts of the system. This is necessary when the loads (or stores) carry additional meaning, like locking a lock before accessing whatever it is we're locking. In this case, letting the compiler/CPU make the accesses more efficient by reordering them is hazardous to the correct operation of our program.
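    A minimal sketch of that flag-before-data pattern in portable C11 atomics (the kernel uses its own primitives such as smp_store_release()/smp_load_acquire(); the names payload, ready, and demo below are made up for illustration, and C11 <stdatomic.h> plus pthreads are assumed to be available):

    ```c
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static int payload;          /* plain shared data                 */
    static atomic_int ready;     /* flag meaning "payload is valid"   */

    static void *producer(void *arg)
    {
        (void)arg;
        payload = 42;            /* 1: write the data first           */
        /* release: the store to payload may not be reordered past
         * this store to ready */
        atomic_store_explicit(&ready, 1, memory_order_release);
        return NULL;
    }

    static void *consumer(void *arg)
    {
        /* acquire: loads after this may not be reordered before it */
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            ;                    /* spin until the flag is observed   */
        *(int *)arg = payload;   /* now guaranteed to see 42          */
        return NULL;
    }

    int demo(void)
    {
        pthread_t p, c;
        int seen = 0;
        atomic_store(&ready, 0);
        pthread_create(&c, NULL, consumer, &seen);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return seen;
    }

    int main(void)
    {
        printf("consumer saw payload = %d\n", demo());
        return 0;
    }
    ```

    Without the release/acquire pair, the compiler or CPU would be free to let the consumer read payload before it observes ready, which is exactly the reordering hazard described above.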

    When reading this document we need to keep two things in mind:

    1. That a load means transmitting a value from memory (or cache) to a CPU register.
    2. That unless the CPUs share the cache (or have no cache at all), it is possible for their cache systems to be momentarily out of sync.

    Fact #2 is one of the reasons why one CPU can perceive the data differently from another. Cache systems are designed to provide good performance and coherence in the general case, but they might need some help in specific cases like the ones illustrated in the document.

    In general, as the document suggests, barriers in systems involving more than one CPU should be paired to force the system to synchronize the perception of both (or all participating) CPUs. Picture a situation in which one CPU completes loads or stores and the main memory is updated, but the new data has yet to reach the second CPU's cache, resulting in a lack of coherence across the CPUs.
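    The pairing the document describes can be sketched with explicit fences in portable C11 (a rough analogue of the kernel's smp_wmb()/smp_rmb() pair; the names data_word, flag, and paired_demo are made up for illustration):

    ```c
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static int data_word;        /* plain shared data                 */
    static atomic_int flag;      /* signals that data_word is set     */

    /* Writer side: roughly what smp_wmb() achieves in kernel code. */
    static void *writer(void *arg)
    {
        (void)arg;
        data_word = 7;                                 /* STORE data  */
        atomic_thread_fence(memory_order_release);     /* write fence */
        atomic_store_explicit(&flag, 1, memory_order_relaxed);
        return NULL;
    }

    /* Reader side: roughly what smp_rmb() achieves; it PAIRS with the
     * writer's fence, and it is the pairing that guarantees the reader
     * observes data_word == 7 once it sees the flag. */
    static void *reader(void *arg)
    {
        while (!atomic_load_explicit(&flag, memory_order_relaxed))
            ;                                          /* wait on flag */
        atomic_thread_fence(memory_order_acquire);     /* read fence   */
        *(int *)arg = data_word;
        return NULL;
    }

    int paired_demo(void)
    {
        pthread_t w, r;
        int seen = 0;
        atomic_store(&flag, 0);
        pthread_create(&r, NULL, reader, &seen);
        pthread_create(&w, NULL, writer, NULL);
        pthread_join(w, NULL);
        pthread_join(r, NULL);
        return seen;
    }

    int main(void)
    {
        printf("reader saw %d\n", paired_demo());
        return 0;
    }
    ```

    Note that either fence alone is useless: drop the writer's fence and the flag can become visible before the data; drop the reader's and the reader may use a stale cached copy of data_word.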

    I hope this helps. I'd suggest reading memory-barriers.txt again with this in mind and particularly the section titled "THE EFFECTS OF THE CPU CACHE".

  • 2021-02-09 09:57

    The key missing point is the mistaken assumption that for the sequence:

    LOAD C (gets &B)
    LOAD *C (reads B)
    

    the first load has to precede the second load. A weakly ordered architecture can act "as if" the following happened:

    LOAD B (reads B)  
    LOAD C (reads &B)
    if( C!=&B ) 
        LOAD *C
    else
        Congratulate self on having already loaded *C
    

    The speculative "LOAD B" can happen, for example, because B was on the same cache line as some other variable of earlier interest, or because hardware prefetching grabbed it.
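    The cure memory-barriers.txt prescribes for exactly this case is to publish the pointer with a barriered store and to read it with READ_ONCE()/rcu_dereference(), which on architectures like Alpha supplies the data-dependency barrier. A portable C11 sketch of the same publish/subscribe pattern (the names B, pub, and publish_demo are made up for illustration; an acquire load stands in conservatively for the kernel's dependency ordering):

    ```c
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static int B;                  /* the pointed-to data               */
    static _Atomic(int *) pub;     /* published pointer, initially NULL */

    static void *publisher(void *arg)
    {
        (void)arg;
        B = 5;                     /* initialize *before* publishing    */
        /* release store: orders the write to B before the pointer
         * becomes visible (kernel: smp_store_release() / smp_wmb()) */
        atomic_store_explicit(&pub, &B, memory_order_release);
        return NULL;
    }

    static void *subscriber(void *arg)
    {
        int *p;
        /* acquire load: forbids the speculative "load *C before load C"
         * outcome described above (kernel: READ_ONCE()/rcu_dereference(),
         * which relies on the address dependency) */
        while (!(p = atomic_load_explicit(&pub, memory_order_acquire)))
            ;                      /* wait until a pointer is published */
        *(int *)arg = *p;          /* now guaranteed to read 5          */
        return NULL;
    }

    int publish_demo(void)
    {
        pthread_t t1, t2;
        int seen = 0;
        atomic_store(&pub, NULL);
        pthread_create(&t2, NULL, subscriber, &seen);
        pthread_create(&t1, NULL, publisher, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return seen;
    }

    int main(void)
    {
        printf("subscriber read %d\n", publish_demo());
        return 0;
    }
    ```

    With only a plain (relaxed) load of pub, the "congratulate self" scenario above is a legal outcome: the CPU may satisfy the dereference from a stale speculative load of B.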
