Handling out of order execution

忘掉有多难 2021-02-02 13:16

I recently stumbled upon this Wikipedia article. From my experience with multi-threading I am aware of the multitude of issues caused by the program being able to switch threads at any point. However, I never knew that compiler and hardware optimisations could reorder operations in a way that is guaranteed to work for a single thread, but not necessarily for multi-threading.

12 Answers
  • 2021-02-02 13:23

    Most compilers nowadays have explicit ordering intrinsics. C++0x (now C++11) has memory ordering intrinsics as well, exposed through std::atomic and the std::memory_order arguments.
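
    As a rough sketch of what those C++11 intrinsics look like in practice (the payload/ready names are made up for the example), a release store paired with an acquire load publishes data from one thread to another:

        #include <atomic>
        #include <cassert>
        #include <thread>

        int payload = 0;
        std::atomic<bool> ready{false};

        void producer() {
            payload = 42;                                  // plain write
            ready.store(true, std::memory_order_release);  // earlier writes may not move below this store
        }

        void consumer() {
            while (!ready.load(std::memory_order_acquire)) // later reads may not move above this load
                ;                                          // spin until the flag is published
            assert(payload == 42);                         // guaranteed to observe the payload
        }

        int main() {
            std::thread t1(producer), t2(consumer);
            t1.join();
            t2.join();
        }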

  • 2021-02-02 13:31

    The compiler does not generate out-of-order execution errors; it optimizes and reorders however it likes, as long as the code it produces yields the results your source code says it should.

    But when dealing with multithreading, this can indeed blow up, though that generally has little to do with how the compiler has reordered your code (although reordering can make things worse in cases that would otherwise happen to work).

    When dealing with threads operating on the same data, you need to be very, very careful and make sure the data is properly protected with the appropriate primitives (semaphores, mutexes, atomic operations and similar).
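
    For instance, a minimal sketch of guarding shared data with a mutex (std::mutex with std::lock_guard; the counter is just a stand-in for whatever your threads actually share):

        #include <iostream>
        #include <mutex>
        #include <thread>
        #include <vector>

        std::mutex m;
        long counter = 0;  // shared data, only touched while holding m

        void work() {
            for (int i = 0; i < 100000; ++i) {
                std::lock_guard<std::mutex> lock(m);  // lock around every access
                ++counter;
            }
        }

        int main() {
            std::vector<std::thread> threads;
            for (int i = 0; i < 4; ++i)
                threads.emplace_back(work);
            for (auto& t : threads)
                t.join();
            std::cout << counter << '\n';  // always 400000 with the lock in place
        }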

  • 2021-02-02 13:33

    Let's be clear: out-of-order execution refers to the processor execution pipeline, not to the compiler per se, as your link clearly demonstrates.
    Out-of-order execution is a strategy employed by most modern CPU pipelines that allows them to re-order instructions on the fly, typically to minimise read/write stalls, which are the most common bottleneck on modern hardware due to the disparity between CPU execution speed and memory latency (i.e. how fast the processor can fetch and process data compared to how fast it can write the result back to RAM).
    So this is primarily a hardware feature, not a compiler feature.
    You can override this behaviour if you know what you're doing, typically by using memory barriers. PowerPC has a wonderfully named instruction called eieio (enforce in-order execution of I/O) that forces the CPU to flush all pending reads and writes to memory. This is particularly important in concurrent programming (whether multi-threaded or multi-processor), as it ensures that all CPUs or threads have a synchronised view of the affected memory locations.
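
    In portable C++ the closest analogue to a full hardware barrier like eieio is a sequentially consistent fence; a hedged sketch (the data/flag names are made up for the illustration):

        #include <atomic>
        #include <cassert>
        #include <thread>

        int data = 0;
        std::atomic<int> flag{0};

        void writer() {
            data = 123;
            // Full fence: memory operations before it are ordered before those after it,
            // conceptually similar to what eieio/sync enforce on PowerPC.
            std::atomic_thread_fence(std::memory_order_seq_cst);
            flag.store(1, std::memory_order_relaxed);
        }

        void reader() {
            while (flag.load(std::memory_order_relaxed) == 0)
                ;                                                 // wait for the writer
            std::atomic_thread_fence(std::memory_order_seq_cst);  // matching fence on the reading side
            assert(data == 123);
        }

        int main() {
            std::thread t1(writer), t2(reader);
            t1.join();
            t2.join();
        }
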
    If you want to read about this in depth, then this PDF is an excellent (though detailed) introduction.
    HTH

  • 2021-02-02 13:33

    The compiler and the CPU both implement algorithms which ensure that sequential semantics are preserved for a given execution stream. For them not to implement said algorithms would qualify as a bug. It is safe to assume that instruction reordering will not affect your program's semantics.

    As noted elsewhere, memory is the only place where non-sequential semantics may become visible; sequential behaviour can be restored there via various well-known mechanisms (at the assembly level there are atomic memory-access instructions; higher-level constructs such as mutexes, barriers, spinlocks, etc. are implemented on top of those atomic instructions).
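
    As a rough sketch of that last point, here is a minimal spinlock built on a single atomic read-modify-write instruction (std::atomic_flag); this is purely illustrative, not a replacement for a real mutex:

        #include <atomic>

        class Spinlock {
            std::atomic_flag locked = ATOMIC_FLAG_INIT;
        public:
            void lock() {
                // test_and_set maps to an atomic instruction; acquire ordering keeps
                // the critical section from moving above the point the lock is taken.
                while (locked.test_and_set(std::memory_order_acquire))
                    ;  // busy-wait until the flag is clear
            }
            void unlock() {
                // release ordering keeps the critical section from moving below the unlock
                locked.clear(std::memory_order_release);
            }
        };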

    In answer to your title: You don't handle OOO execution.

  • 2021-02-02 13:34

    "However, I never knew that compiler and hardware optimisations could reorder operations in a way that is guaranteed to work for a single thread, but not necessarily for multi-threading."

    As neither C nor C++ historically had a strongly defined memory model (this only changed with C11/C++11), compilers could reorder operations in ways that might cause issues for multi-threading. But compilers which are designed for use in multi-threaded environments don't.

    Multi-threaded code either writes to memory and uses a fence to ensure visibility of the writes between threads, or it uses atomic operations.

    Since the values used in the atomic-operation case are observable within a single thread, the reordering does not affect them - they have to have been calculated correctly prior to the atomic operation.

    Compilers intended for multi-threaded applications do not reorder across memory fences.
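
    To make the compiler-level part of that concrete, a small sketch: std::atomic_signal_fence is a pure compiler fence (it emits no hardware barrier), and GCC/Clang also accept an empty asm with a "memory" clobber for the same purpose. The variable names are just for illustration:

        #include <atomic>

        int a = 0, b = 0;

        void update() {
            a = 1;
            // Compiler-only fence: the compiler may not move the stores to a and b
            // across this point, even though no barrier instruction is emitted.
            std::atomic_signal_fence(std::memory_order_seq_cst);
            b = 2;
            // GCC/Clang-specific equivalent:
            // asm volatile("" ::: "memory");
        }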

    So the reordering either does not affect the behaviour, or is suppressed as a special case.

    If you are already writing correct multi-threaded code, the compiler reordering doesn't matter. It's only an issue if the compiler isn't aware of memory fences, in which case you probably shouldn't be using it to write multi-threaded code in the first place.

  • 2021-02-02 13:36

    Let me ask a question: given a piece of program code (say it is a single-threaded application), what is the correct execution? Intuitively, having the CPU execute in order, exactly as the code specifies, would be correct. This illusion of sequential execution is what programmers rely on.

    However, modern CPUs don't obey such a restriction. As long as dependences are not violated (data dependences, control dependences, and memory dependences), CPUs execute instructions in an out-of-order fashion. However, this is completely hidden from programmers; programmers can never see what is going on inside the CPU.

    Compilers exploit this fact as well. If the program's semantics (i.e., the inherent dependences in your code) can be preserved, compilers will reorder any instruction they can to achieve better performance. One notable optimization is code hoisting: compilers may hoist load instructions to minimize memory latency. But don't worry, compilers guarantee correctness; they will NOT crash your program through such instruction reordering, since they must at least preserve the dependences. (Compilers might have bugs, of course :-)
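
    A hedged sketch of how that kind of hoisting can bite a multi-threaded program (the flag names are made up for the example): with a plain int the compiler is allowed to hoist the load out of the loop, since that changes nothing for a single thread, while std::atomic forbids it:

        #include <atomic>

        int plain_flag = 0;             // set by another thread
        std::atomic<int> safe_flag{0};  // same idea, with defined cross-thread semantics

        void spin_plain() {
            // The compiler may hoist the load of plain_flag out of the loop
            // (it is allowed to assume no other thread modifies it), so this
            // can spin forever even after the other thread sets the flag.
            while (plain_flag == 0) {
                /* spin */
            }
        }

        void spin_atomic() {
            // With std::atomic the load must be re-done every iteration and is
            // visible across threads, so the loop ends once the flag is set.
            while (safe_flag.load(std::memory_order_acquire) == 0) {
                /* spin */
            }
        }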

    If you're only considering a single-threaded application, you do not need to worry about such out-of-order execution, whether it comes from the compiler or from the CPU.

    (To learn more, I recommend taking a look at the concept of ILP (instruction-level parallelism). Single-thread performance depends mostly on how much ILP can be extracted from a single thread, so both CPUs and compilers do whatever they can for better performance.)

    However, when you consider multithreaded execution, a potential problem called the memory consistency problem arises. Intuitively, programmers assume sequential consistency. However, modern multi-core architectures perform dirty and aggressive optimizations (e.g., caches and write buffers), and it is hard to implement sequential consistency with low overhead on modern hardware. So very confusing situations can arise from out-of-order execution of memory loads and stores: you may observe that some loads and stores were executed out of order. Read some articles on relaxed memory models, such as Intel x86's memory model (read Chapter 8, Memory Ordering, of Volume 3A of the Intel 64 and IA-32 Architectures Software Developer's Manual). Memory barriers are needed in this situation, where you have to enforce the order of memory instructions for correctness.
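
    A classic illustration of such reordering is the store-buffer (Dekker-style) example below; it is only a sketch, and the relaxed orderings are chosen deliberately so the surprising outcome is allowed:

        #include <atomic>
        #include <cstdio>
        #include <thread>

        std::atomic<int> x{0}, y{0};
        int r1, r2;

        void t1() { x.store(1, std::memory_order_relaxed); r1 = y.load(std::memory_order_relaxed); }
        void t2() { y.store(1, std::memory_order_relaxed); r2 = x.load(std::memory_order_relaxed); }

        int main() {
            std::thread a(t1), b(t2);
            a.join();
            b.join();
            // With relaxed ordering (or with plain stores/loads on x86, where the store
            // buffer lets a store be reordered after a later load), r1 == 0 && r2 == 0
            // is a permitted outcome. Using memory_order_seq_cst, or inserting full
            // fences between each store and load, rules it out.
            std::printf("r1=%d r2=%d\n", r1, r2);
        }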

    THE ANSWER TO THE QUESTION: it's not easy to answer this question briefly. There are no good tools that detect such out-of-order and problematic behaviors arising from the memory consistency model (though there are research papers), so in short it is hard even to tell whether such bugs exist in your code. However, I strongly suggest you read articles on double-checked locking and its detailed paper. In double-checked locking, relaxed memory consistency and compiler reordering (note that compilers are not aware of multi-threaded behavior unless you explicitly tell them, e.g. with memory barriers) can lead to misbehavior.
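
    For reference, a sketch of double-checked locking written with C++11 atomics (the naive version with a plain pointer and no ordering is exactly the broken pattern those articles describe); Widget is a made-up placeholder type:

        #include <atomic>
        #include <mutex>

        struct Widget { int value = 0; };

        std::atomic<Widget*> instance{nullptr};
        std::mutex init_mutex;

        Widget* get_instance() {
            // First (unlocked) check: the acquire load pairs with the release store
            // below, so a non-null pointer implies a fully constructed Widget.
            Widget* p = instance.load(std::memory_order_acquire);
            if (p == nullptr) {
                std::lock_guard<std::mutex> lock(init_mutex);
                p = instance.load(std::memory_order_relaxed);  // second check, under the lock
                if (p == nullptr) {
                    p = new Widget();
                    instance.store(p, std::memory_order_release);  // publish only after construction
                }
            }
            return p;
        }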

    In sum:

    • If you're only working on a single-threaded program, then you don't need to worry about out-of-order behaviors.
    • On multi-core, you may need to consider memory consistency problems. But it's actually rare that you really need to worry about memory consistency issues; mostly, data races, deadlocks, and atomicity violations are what kill your multi-threaded program.