For x86-64 architecture, is there an instruction that can load data at a given memory address to the cache? Similarly, is there an instruction that can evict a cache line gi
Prefetch data into cache (without loading it into a register):
PREFETCHT0 [address]
PREFETCHT1 [address]
PREFETCHT2 [address]
intrinsic: void _mm_prefetch (char const* p, int hint)
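See Intel's instruction set reference manual and other guides for what the different nearness hints mean; with the intrinsic, the hint argument is one of the constants _MM_HINT_T0, _MM_HINT_T1, or _MM_HINT_T2. (Other links at the x86 tag wiki.)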
The famous article What Every Programmer Should Know About Memory was written when the Pentium 4 (P4) was current. Current CPUs have smarter hardware prefetchers, and hyperthreading is useful for much more than just running prefetch threads, so prefetch threads are a dead idea. Other than that, it is an excellent article about caching; I wrote an SO answer with a modern review of what has changed and what is still relevant in Ulrich Drepper's original. Search for other SO posts and benchmarks to decide when software prefetch is actually worth using.
Do not overdo it with software prefetch on Intel IvyBridge. That specific microarchitecture has a performance bug, and can only retire one prefetch per 43 clocks.
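As a rough illustration of the intrinsic, here is a minimal sketch in C that prefetches a later element of an indirectly indexed table while working on the current one. The function name sum_gather and the PREFETCH_DISTANCE value are hypothetical, chosen only for this example; a useful distance depends on the CPU and the access pattern and has to be found by measurement.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 */

/* Hypothetical tuning knob: how many iterations ahead to prefetch.
   This value is an assumption for illustration, not a recommendation. */
#define PREFETCH_DISTANCE 64

/* Sum table[idx[i]] for i in [0, n). The indirect access pattern is the
   kind hardware prefetchers handle poorly, so a software prefetch of a
   future target may help. The prefetch is only a hint and never faults. */
long sum_gather(const long *table, const size_t *idx, size_t n)
{
    assert(table != NULL && idx != NULL);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Ask for the line holding a future element, into all cache levels (T0). */
            _mm_prefetch((const char *)&table[idx[i + PREFETCH_DISTANCE]],
                         _MM_HINT_T0);
        }
        total += table[idx[i]];
    }
    return total;
}
```

The result is identical with or without the prefetch; only timing can change, so benchmark both versions before keeping it.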
Flush the cache line containing a given address:
clflush [address]
clflushopt [address] ; Newer CPUs only. Weakly ordered, with higher throughput.
intrinsics: void _mm_clflush (void const * p) and void _mm_clflushopt (void const * p)
There was a recent question about its performance.
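For flushing, a minimal sketch in C that evicts every cache line covering a buffer might look like the following. It uses _mm_clflush (SSE2, so available on every x86-64 CPU) rather than _mm_clflushopt, which needs CPU support and a compiler flag such as -mclflushopt. The helper name flush_range and the 64-byte CACHE_LINE constant are assumptions for this example; 64 bytes is the line size on current x86-64 CPUs, but robust code would query it via CPUID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence (SSE2) */

#define CACHE_LINE 64  /* assumed line size; query CPUID in real code */

/* Flush every cache line that overlaps [p, p + len) and return the
   number of lines flushed. clflush writes dirty data back to memory
   before invalidating, so the buffer contents are preserved. */
size_t flush_range(const void *p, size_t len)
{
    assert(p != NULL);
    if (len == 0) {
        return 0;
    }
    /* Round the start down to a line boundary. */
    uintptr_t start = (uintptr_t)p & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)p + len;
    size_t lines = 0;
    for (uintptr_t a = start; a < end; a += CACHE_LINE) {
        _mm_clflush((const void *)a);
        lines++;
    }
    /* Fence so the flushes are ordered before later loads and stores. */
    _mm_mfence();
    return lines;
}
```

Swapping in _mm_clflushopt (from immintrin.h) would be the natural change on CPUs that support it; because it is weakly ordered, the trailing fence matters even more there.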