Variance in RDTSC overhead
问题 I'm constructing a micro-benchmark to measure performance changes as I experiment with the use of SIMD instruction intrinsics in some primitive image processing operations. However, writing useful micro-benchmarks is difficult, so I'd like to first understand (and if possible eliminate) as many sources of variation and error as possible. One factor that I have to account for is the overhead of the measurement code itself. I'm measuring with RDTSC, and I'm using the following code to find the