micro-optimization

Does a Length-Changing Prefix (LCP) incur a stall on a simple x86_64 instruction?

浪尽此生 posted on 2021-01-20 04:48:03
Question

Consider a simple instruction like

    mov RCX, RDI    # 48 89 f9

The 48 is the REX prefix for x86_64. It is not an LCP. But consider adding an LCP (for alignment purposes):

    .byte 0x67
    mov RCX, RDI    # 67 48 89 f9

67 is an address-size prefix, which in this case is applied to an instruction that has no memory operand. This instruction also has no immediates, and it doesn't use the F7 opcode (false LCP stalls; F7 would be TEST, NOT, NEG, MUL, IMUL, DIV + IDIV). Assume that it doesn't cross a 16-byte boundary either.
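To make the byte layout in the question concrete, here is a minimal Python sketch that splits the padded encoding `67 48 89 f9` into its fields. The helper name `split_fields` is hypothetical, and the decoder is deliberately simplified: it assumes only the 66/67 legacy prefixes, an optional REX byte, and a one-byte opcode followed by a ModRM byte. It only illustrates the encoding described above and says nothing about whether a stall occurs.

```python
# Simplified field splitter for the encoding in the question.
# Assumptions (not a full x86 decoder): only the 0x66/0x67 legacy
# prefixes, an optional REX byte, a one-byte opcode, then ModRM.
LEGACY_PREFIXES = {
    0x66: "operand-size override",
    0x67: "address-size override",
}


def split_fields(code: bytes) -> dict:
    i = 0
    prefixes = []
    # Legacy prefixes come first and may repeat.
    while code[i] in LEGACY_PREFIXES:
        prefixes.append(code[i])
        i += 1
    # In 64-bit mode, 0x40-0x4F is a REX prefix; bit 3 is REX.W.
    rex = None
    if 0x40 <= code[i] <= 0x4F:
        rex = code[i]
        i += 1
    opcode = code[i]
    modrm = code[i + 1]
    mod = modrm >> 6
    return {
        "prefixes": prefixes,
        "rex_w": rex is not None and bool(rex & 0x08),
        "opcode": opcode,
        "mod": mod,
        "reg": (modrm >> 3) & 0b111,
        "rm": modrm & 0b111,
        # mod == 0b11 means register-direct: no memory operand at all.
        "register_only": mod == 0b11,
    }


if __name__ == "__main__":
    fields = split_fields(bytes.fromhex("67 48 89 f9"))
    for name, value in fields.items():
        print(f"{name}: {value}")
```

For the bytes in the question, the ModRM byte `f9` has `mod = 3`, so both operands are registers (`reg = 7` is RDI, `rm = 1` is RCX), which matches the statement that the 67 prefix sits on an instruction with no address to resize.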
