Estimating Cycles Per Instruction
Question: I have disassembled a small C++ program compiled with MSVC v140 and am trying to estimate the cycles per instruction to better understand how code design affects performance. I've been following Mike Acton's CppCon 2014 talk, "Data-Oriented Design and C++", specifically the portion I've linked to. In it, he points out these lines:

    movss 8(%rbx), %xmm1
    movss 12(%rbx), %xmm0

He then claims that these two 32-bit reads are probably on the same cache line and therefore cost roughly 200