To measure the impact of cache-misses in a program, I want to latency caused by cache-misses to the cycles used for actual computation. I use perf stat to measu
perf stat