问题 I was messing around with optimizing a function using Google Benchmark, and ran into a situation where my code was unexpectedly slowing down in certain situations. I started experimenting with it, looking at the compiled assembly, and eventually came up with a minimal test case that exhibits the issue. Here's the assembly I came up with that exhibits this slowdown: .text test: #xorps %xmm0, %xmm0 cvtsi2ss %edi, %xmm0 addss %xmm0, %xmm0 addss %xmm0, %xmm0 addss %xmm0, %xmm0 addss %xmm0, %xmm0