The Intel 64 and IA-32 Architectures Software Developer's Manual says the following about re-ordering of actions by a single processor (Section 8.2.2, "Memory Ordering in
Section 8.2.3.5, "Intra-Processor Forwarding Is Allowed", gives an example of store-buffer forwarding:
Initially x = y = 0

    Processor 0        Processor 1
    ==============     =============
    mov [x], 1         mov [y], 1
    mov r1, [x]        mov r3, [y]
    mov r2, [y]        mov r4, [x]

The result r2 == 0 and r4 == 0 is allowed. ... the reordering in this example can arise as a result of store-buffer forwarding. While a store is temporarily held in a processor's store buffer, it can satisfy the processor's own loads but is not visible to (and cannot satisfy) loads by other processors.
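To make the quoted behavior concrete, here is a minimal sketch of a toy store-buffer model in Python. It is not how the hardware is actually implemented; the `Processor` class, its `store`/`load`/`drain` methods, and `run_litmus` are hypothetical names made up for this illustration. It assumes each processor has a FIFO store buffer and that neither buffer drains before all four loads have executed, which is one of the interleavings the manual allows.

```python
class Processor:
    """Toy model of one core with a FIFO store buffer (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory      # shared dict: location name -> value
        self.store_buffer = []    # pending (location, value) pairs, oldest first

    def store(self, loc, value):
        # The store is buffered; other processors cannot see it yet.
        self.store_buffer.append((loc, value))

    def load(self, loc):
        # Store-buffer forwarding: the newest pending store to loc
        # satisfies this processor's own load.
        for pending_loc, value in reversed(self.store_buffer):
            if pending_loc == loc:
                return value
        # Otherwise read shared memory, which holds only drained stores.
        return self.memory[loc]

    def drain(self):
        # Commit buffered stores to shared memory in program order.
        for loc, value in self.store_buffer:
            self.memory[loc] = value
        self.store_buffer.clear()


def run_litmus():
    """Replay the example, assuming no buffer drains before the loads."""
    memory = {"x": 0, "y": 0}
    p0, p1 = Processor(memory), Processor(memory)

    p0.store("x", 1)      # Processor 0: mov [x], 1
    p1.store("y", 1)      # Processor 1: mov [y], 1
    r1 = p0.load("x")     # Processor 0: mov r1, [x]  (forwarded from buffer)
    r3 = p1.load("y")     # Processor 1: mov r3, [y]  (forwarded from buffer)
    r2 = p0.load("y")     # Processor 0: mov r2, [y]  (reads shared memory)
    r4 = p1.load("x")     # Processor 1: mov r4, [x]  (reads shared memory)

    p0.drain()
    p1.drain()
    return r1, r2, r3, r4, dict(memory)
```

In this model `run_litmus()` gives r1 == 1 and r3 == 1 (each processor sees its own buffered store) together with r2 == 0 and r4 == 0 (neither sees the other's store yet), and memory ends up with x == 1 and y == 1 once both buffers drain.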
The statement that reads can't be reordered with writes to the same location ("Reads may be reordered with older writes to different locations but not with older writes to the same location") appears in a section that applies to "a single-processor system for memory regions defined as write-back cacheable". The store-buffer forwarding example, by contrast, describes multi-processor behavior: a single processor always sees its own stores in program order, so the effect only shows up when more than one processor is involved.