Concurrency and memory models

后端 未结 4 1714
广开言路
广开言路 2021-02-15 14:52

I\'m watching this video by Herb Sutter on GPGPU and the new C++ AMP library. He is talking about memory models and mentions Weak Memory Models and then

4条回答
  •  深忆病人
    2021-02-15 15:25

    In terms of concurrency, a memory model specifies the constraints on data accesses, and the conditions under which data written by one thread/core/processor becomes visible to another.

    The terms weak and strong are somewhat ambiguous, but the basic premise is that a strong memory model places a lot of constraints on the hardware to ensure that writes by one thread/core/processor are visible to other threads/cores/processors in clearly-defined orders, whilst allowing the programmer maximum freedom of data access.

    On the other hand, a weak model places very little constraints on the hardware, but instead places the responsibility of ensuring visibility in the hands of the programmer.

    The strongest memory model is Sequential Consistency: all operations to all data by all processors form a single total order agreed on by all processors, which is consistent with the order of operations on each processor individually. This is essentially an interleaving of the operations of each processor.

    The weakest memory model will not impose any restrictions on the order that processors see each other's writes. Different processors in the same system may see writes in different orders, and some processors may use "stale" data from their own cache for a long time after a write to the same memory address by another processor. Sometimes, whole cache lines are treated as a single unit, so a write to one variable on a cache line will cause writes from other processors to other variables on that cache line that are not yet visible to the first processor to be effectively discarded, as the stale values are written over the top when it eventually writes the cache line to memory. Under such a scheme, extreme care must be taken to ensure that data is transferred to other processors in the correct order, using explicit synchronization instructions.

    For example, the Intel x86 memory model is generally considered to be on the stronger end, as there are strict rules about the order in which writes become visible to other processors, whereas the DEC Alpha and ARM processors are generally considered to have weak memory models, as writes from one processor are only required to be visible to other processors in a particular order if you explicitly put ordering instructions (memory fences or barriers) in your code.

    Some systems have memory that is only accessible by particular processors. Transferring data between these processors therefore requires explicit data transfer instructions. This is the case with the Cell processors, and is often the case with GPUs as well. This can be viewed as an extreme of a weak memory model --- data is only visible to other processors if you explicitly invoke the data transfer.

    Programming languages usually impose their own memory models on top of whatever is provided by the underlying processors. For example, C++0x specifies a complete set of ordering constraints ranging from completely relaxed to full sequential consistency, so you can specify in code what you require. On the other hand, Java has a very specific set of ordering constraints that must be adhered to and cannot be varied. In both cases the compiler must translate the desired constraints into the relevant instructions for the underlying processor, which may be quite involved if you request sequential consistency on a weakly ordered machine.

提交回复
热议问题