Question
When running a boot-loader program on a modern-day x86 processor, the processor will be running in real-address mode. Will its instruction pipelining features be active in real mode, or not?
Answer 1:
Yes, the out-of-order core in modern microarchitectures operates basically the same regardless of mode. Most of the difference is in the decoders. See Agner Fog's microarch pdf and other links in the x86 tag wiki for details of how modern CPUs actually work internally.
It would probably take extra silicon to behave differently in 16bit mode, since it's very similar to 32bit mode with paging disabled, but with a different default address-size and operand-size.
I've read that AMD CPUs are slightly slower when segments have a non-zero base. (Or I guess in 16bit mode: when segment registers themselves are set to non-zero values, since in 16bit mode they're used directly, rather than being selectors for descriptors.)
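To make "used directly" concrete, here is a minimal sketch of how a real-mode segment:offset pair maps to a linear address. The function name real_mode_linear is made up for illustration, and A20 masking is deliberately ignored.

```c
#include <stdint.h>

/* Real-address mode: the segment register value itself supplies the base.
 * linear = segment * 16 + offset.
 * Assumption: no A20 masking, so the result can exceed 1 MiB
 * (the maximum is 0xFFFF0 + 0xFFFF = 0x10FFEF). */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, 0x07C0:0x0000 and 0x0000:0x7C00 both name linear address 0x7C00, the conventional BIOS load address for a boot sector. Only the second form has a zero segment base.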
Keep in mind that many common 16bit idioms, like the loop instruction, are slow on modern CPUs.
Also, partial-register slowdowns can easily interfere with out-of-order execution if you aren't careful. Intel P6-family and SnB-family CPUs rename partial registers separately, so writing to AX doesn't have a false dependency on the full contents of EAX/RAX. The cost comes later, when the partial register has to be merged back into the full one: that merge can stall on CPUs before SnB, and causes just minor slowdowns on SnB-family CPUs before Haswell.
All other microarchitectures treat mov ax, 5 as a read-modify-write of eax, so it doesn't break the dependency chain on the old value of ax. This can be a huge problem for out-of-order execution if you aren't careful.
Read Agner Fog's manuals to learn more.
16bit addressing modes might not perform well, I forget. 32bit code doesn't need them to be fast, and 64bit code can't use 16bit addresses at all. (The address-size prefix in 64bit code means address-size = 32bits.)
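The effect of the address-size prefix (byte 0x67) in each mode can be summarized in a small lookup. This is only a sketch: the enum and function names are made up for illustration.

```c
/* Default code size of the current mode/segment (hypothetical names). */
enum code_size { CODE16, CODE32, CODE64 };

/* Effective address size in bits, given the default code size and
 * whether the instruction carries a 0x67 address-size prefix. */
static int effective_address_bits(enum code_size mode, int has_67_prefix)
{
    switch (mode) {
    case CODE16: return has_67_prefix ? 32 : 16;
    case CODE32: return has_67_prefix ? 16 : 32;
    case CODE64: return has_67_prefix ? 32 : 64; /* never 16 */
    }
    return 0; /* not reached for valid enum values */
}
```

So 16-bit addressing modes are reachable only from 16-bit code (by default) or 32-bit code (with the prefix), which is why nothing performance-sensitive written for 32-bit or 64-bit mode depends on them.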
VEX-coded instructions (including BMI2 integer instructions like pext) aren't available in real mode. This Intel forum topic suggests that may be due to existing software (NTVDM) using the machine code as a trap to protected mode (i.e. the same illegal register-operand encodings of LDS/LES that VEX uses). Making VEX-coded instructions still generate #UD is thus important for backwards compatibility.
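The overlap with LDS/LES can be sketched as a tiny classifier. The VEX prefix bytes 0xC4 and 0xC5 are the LES and LDS opcodes, and VEX is told apart by a following byte whose top two bits are 11, which as a ModRM byte would be a register operand that LES/LDS never accepted. This is a simplified model, not a decoder, and all names are made up for illustration.

```c
#include <stdint.h>

enum cpu_mode { MODE_REAL_OR_V86, MODE_PROTECTED, MODE_LONG64 };
enum c4c5_meaning { NOT_C4_C5, LES_LDS_MEMORY, VEX_PREFIX, INVALID_UD };

/* Classify what the bytes 0xC4/0xC5 mean, looking only at the next byte.
 * Assumption: MODE_PROTECTED covers 16- and 32-bit protected mode
 * (not virtual-8086 mode). */
static enum c4c5_meaning classify_c4_c5(enum cpu_mode mode,
                                        uint8_t opcode, uint8_t next)
{
    if (opcode != 0xC4 && opcode != 0xC5)
        return NOT_C4_C5;
    if (mode == MODE_LONG64)
        return VEX_PREFIX;      /* LES/LDS do not exist in 64-bit mode */
    if ((next >> 6) != 3)
        return LES_LDS_MEMORY;  /* ModRM.mod != 11: ordinary memory operand */
    /* mod == 11 would be a register source, illegal for LES/LDS. */
    return mode == MODE_PROTECTED ? VEX_PREFIX : INVALID_UD;
}
```

For example, C5 F8 (the first two bytes of vzeroupper) classifies as VEX_PREFIX in protected mode but INVALID_UD in real mode, which is the trap behaviour older software could rely on.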
SSE is still available in real mode, though, if you enable it with the right CR setting.
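As a rough sketch of what "the right CR setting" involves: the usual recipe is to clear CR0.EM, set CR0.MP, and set CR4.OSFXSR (plus, optionally, CR4.OSXMMEXCPT). The helpers below only compute the new register values. Actually reading and writing CR0/CR4 needs privileged mov instructions, which are not shown, and the helper names are made up for illustration.

```c
#include <stdint.h>

#define CR0_MP          (1u << 1)   /* monitor coprocessor */
#define CR0_EM          (1u << 2)   /* x87 emulation; must be 0 for SSE */
#define CR4_OSFXSR      (1u << 9)   /* OS supports FXSAVE/FXRSTOR */
#define CR4_OSXMMEXCPT  (1u << 10)  /* OS handles SIMD FP exceptions */

/* New CR0 value: emulation off, coprocessor monitoring on. */
static uint32_t cr0_with_sse(uint32_t cr0)
{
    return (cr0 & ~CR0_EM) | CR0_MP;
}

/* New CR4 value. Assumption: OSXMMEXCPT is set too, which is optional. */
static uint32_t cr4_with_sse(uint32_t cr4)
{
    return cr4 | CR4_OSFXSR | CR4_OSXMMEXCPT;
}
```

Until CR4.OSFXSR is set and CR0.EM is clear, SSE instructions raise #UD, so a boot loader that wants SSE has to do this before its first SSE instruction.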
(VEX/EVEX are available in 16-bit protected mode, but not real or virtual-8086 mode. See "Is x86 32-bit assembly code valid x86 64-bit assembly code?")
Source: https://stackoverflow.com/questions/37829075/is-pipelining-oooe-available-on-modern-x86-processors-when-running-in-real-mode