问题
I recently learned about the row hammer attack. In order to perform this attack the programmer needs to flush the complete cache hierarchy of a CPU for a specific number of addresses.
My question is: why is CLFLUSH necessary in x86? What are the reasons for ever using this instruction, if all L* caches act transparently (i.e., no explicit cache invalidation needed)? Besides that: isn't the CPU free to speculate memory access patterns, and thereby ignore the instruction altogether?
回答1:
I think the main use-case is Non-volatile DIMMs, especially Intel's Optane DC PM. It's normally mapped WB-cacheable so requires explicit flushes (or movnt
) to make sure data is persisted to non-volatile storage.
Skylake introduced weakly-ordered higher performance CLFLUSHOPT because it's useful for non-volatile storage hooked up to the memory hierarchy directly. Flushing cache makes sure data is written out to actual memory, not still dirty in the CPU.
See also this SuperUser answer for some links and background on Optane DC PM (Persistent Memory). It's non-volatile storage in physical address-space, not just in virtual address space with software tricks.
Dan Luu's article on clwb and pcommit is interesting: the benefits of taking the OS out of the way for access to storage, detailing Intel's plans at that point for clflush / clwb and their memory-ordering semantics. It was written while Intel was still planning to require an instruction called pcommit
(persistent commit) as part of this process, but Intel later decided to remove that instruction: Deprecating the PCOMMIT Instruction (from Intel) has some interesting info about why, and how things work under the hood.
It potentially also matters for non-cache-coherent DMA to devices, if anything can still do that in x86. (Probably not; I think all DMA is cache-coherent now.)
来源:https://stackoverflow.com/questions/39336536/why-does-clflush-exist-in-x86