Why does the BIOS entry point start with a WBINVD instruction?

一个人的身影 2021-02-07 09:03

I'm investigating the BIOS code in my machine (x86_64 Linux, IvyBridge). I use the following procedure to dump the BIOS code:

$ sudo cat /proc/iomem | grep ROM
         


        
3 answers
  • 2021-02-07 09:21

    This is actually the answer to the title question:

    Hadi Brais: According to slide 14 of BIOS and System Management Mode Internals, the wbinvd instruction was there in UDK2010 but was later removed in UDK2012. Perhaps it's security-related. I don't know exactly what.

    I can confirm that this instruction is not present at 0xfffffff0 on my machine from 2017.

    There is a more burning question here, and that is what the comparison with 0xea means.

    Here is my code jumped to by the reset vector at 0xfffffff0:

    0x00:  DB E3                      fninit 
    0x02:  0F 6E C0                   movd   mm0, eax   //move BIST value to mm0
    0x05:  0F 31                      rdtsc  
    0x07:  0F 6E EA                   movd   mm5, edx
    0x0a:  0F 6E F0                   movd   mm6, eax  //save tsc
    0x0d:  66 33 C0                   xor    eax, eax //clear eax
    
    0x10:  8E C0                      mov    es, ax
    0x12:  8C C8                      mov    ax, cs
    0x14:  8E D8                      mov    ds, ax
    0x16:  B8 00 F0                   mov    ax, 0xf000
    0x19:  8E C0                      mov    es, ax
    0x1b:  67 26 A0 F0 FF 00 00       mov    al, byte ptr es:[0xfff0]
    0x22:  3C EA                      cmp    al, 0xea
    0x24:  74 0E                      je     0x34   //if ea is at ffff0h then jump to the 0xf000e05b check 
    
    0x26:  BA F9 0C                   mov    dx, 0xcf9
    0x29:  EC                         in     al, dx    //read port 0xcf9
    0x2a:  3C 04                      cmp    al, 4    
    0x2c:  75 25                      jne    0x53      
    0x2e:  BA F9 0C                   mov    dx, 0xcf9 //perform hard reset since if CPU only reset is issued not all MSRs are restored to their defaults
    0x31:  B0 06                      mov    al, 6
    0x33:  EE                         out    dx, al  
    
    0x34:  67 66 26 A1 F1 FF 00 00    mov    eax, dword ptr es:[0xfff1]
    0x3c:  66 3D 5B E0 00 F0          cmp    eax, 0xf000e05b
    0x42:  75 0F                      jne    0x53      //if it isn't, move to notwarmstart; it's not a warm start because BIOS shadow isn't present
    
    0x44:  B9 1B 00                   mov    cx, 0x1b //if it is equal, read bsp bit from apic_base msr
    0x47:  0F 32                      rdmsr  
    0x49:  F6 C4 01                   test   ah, 1
    0x4c:  74 41                      je     0x8f   //if the and operation with 00000001b produces a zero result i.e. it's an AP then jump to cli, hlt
    
    0x4e:  EA F0 FF 00 F0             ljmp   0xf000:0xfff0 //if it's the BSP and the shadow ROM is present, jump to 0xffff0
    
    notwarmstart:
    0x53:  B0 01                      mov    al, 1
    0x55:  E6 80                      out    0x80, al  //send 1 as a debug POST code
    0x57:  66 BE 68 FF FF FF          mov    esi, 0xffffff68
    0x5d:  66 2E 0F 01 14             lgdt   cs:[si] //loads 32-bit base & 16-bit limit GDT pointer (not 24 & 16, due to 66 prefix) from 16-bit offset 0xff68 in si into GDTR (base:ffffff28 limit:003f); will be accessing alias and not shadow ROM
    
    //enter 16 bit protected mode//
    0x62:  0F 20 C0                   mov    eax, cr0
    0x65:  66 83 C8 03                or     eax, 3   //Set PE bit (bit #0) & MP bit (bit #1)
    0x69:  0F 22 C0                   mov    cr0, eax  //Activate protected mode
    0x6c:  0F 20 E0                   mov    eax, cr4 
    0x6f:  66 0D 00 06 00 00          or     eax, 0x600 //Set OSFXSR bit (bit #9) & OSXMMEXCPT bit (bit #10)
    0x75:  0F 22 E0                   mov    cr4, eax
    
    //set up selectors for 32 bit protected mode entry
    0x78:  B8 18 00                   mov    ax, 0x18 //segment descriptor at 0x18 in GDT is (raw): 00cf93000000ffff
    0x7b:  8E D8                      mov    ds, ax
    0x7d:  8E C0                      mov    es, ax
    0x7f:  8E E0                      mov    fs, ax
    0x81:  8E E8                      mov    gs, ax
    0x83:  8E D0                      mov    ss, ax
    0x85:  66 BE 6E FF FF FF          mov    esi, 0xffffff6e
    0x8b:  66 2E FF 2C                ljmp   cs:[si]   //transition to flat 32 bit protected mode and jump to address at 0x0:0xffffff6e aka. 0xffffff6e which is fffffcd8. CS contains 0 remember (it's the base that is 0xffff) so it will load the first entry.
                                                       //PEI begins at that address
    
    0x8f:  FA                         cli    
    0x90:  F4                         hlt    
    .
    .
    

    We notice that my code differs from yours. There is an extra comparison to 0xf000e05b and a read/write to 0xcf9.

    A clue here in the edk2 source code is that the code being jumped to is called 'NotWarmStart'. The code speaks for itself. The key to solving this is by analysing the 3 different implementations carefully (+ your observations from UEFI legacy boot vs UEFI boot).

    In mine, if EA is at FFFF0h then it checks FFFF1h for 0xf000e05b. If 0xf000e05b is there then it checks the BSP flag, and if it's the BSP, it jumps to FFFF0h. If 0xf000e05b isn't there, it jumps to the 16-bit + 32-bit protected mode setup (called 'NotWarmStart'), which then jumps to the 32-bit flat protected mode implementation through the far pointer stored at 0xffffff6e. (edk2 calls this PEI, but I'd say PEI classically begins at the PEI core and that the code it jumps to is actually still SEC, given that it uses FSP to set up CAR, optionally performs microcode updates if BootGuard isn't present, and passes control to the PEI core.) If EA is not present, it reads 0xcf9 and compares the value with 4 (i.e. only the third bit set), which is the 'Check INIT# is asserted' test. If it is asserted then it performs a hard reset by writing 0x6, which results in a PLTRST#; the stated reason is 'issue warm start, since if CPU only reset is issued not all MSRs are restored to their defaults'. If it isn't asserted then it jumps to 'NotWarmStart'.

    There are 2 suggestions at play for the reason 0xffff0 is proven to contain a different value to 0xfffffff0 at reset. 1) RAM contains data and PAMs are steering the 0xfffff range to RAM rather than SPI ROM. RAM would only contain data if some kind of soft reset occurred, like INIT#, where RAM is unaffected. 2) UEFI legacy boot causes Intel ME to set PAMs to default to RAM rather than SPI ROM / disables BIOS decode enable bit BIOS_LEGACY_F_EN on LPC or SPI Bridge (which seems a bit unlikely and elaborate to me, and I feel like the default values will hold true at the reset vector).

    At runtime, your dump shows identical code at 0xffff0 and 0xfffffff0 for UEFI boot but different code for UEFI legacy boot. It looks to me like in UEFI mode, there is no shadow ROM in RAM at 0xffff0. You're probably directly accessing the SPI ROM because there's no reason for that range to be touched (legacy option ROMs aren't required and I've got legacy option ROMs shadowed in my UEFI legacy boot system. In UEFI mode, there will be DXE drivers present in the XROMBAR space that will be used instead).

    Just looking at your code it's easy to say: the check for 0xea is saying 'if 0xea isn't there then it's a UEFI boot, so jump to 32 bit SEC and determine whether warm or not later'. 'if 0xea is there then it's a warm start and the previous boot was a legacy boot, so jump to the shorthand implementation at 0xffff0'.

    The problem is, my code reveals the 3rd option, and it has to be there for a reason. 0xffff0 can be in 3 different states. Not containing 0xea (jumps to 32 bit SEC); containing 0xea and 0xf000e05b (if BSP, jumps to 0xffff0 otherwise hlt); containing 0xea and not 0xf000e05b (jumps to 32 bit SEC).
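
To make the three states concrete, here is a small Python model of the branches in my listing. It only mirrors the disassembly above; the function name reset_path and its argument names are made up for illustration, and the returned strings are just labels.

```python
def reset_path(byte_at_ffff0: int, dword_at_ffff1: int,
               cf9_value: int, is_bsp: bool) -> str:
    """Model of the branch structure in the reset-vector listing (a sketch)."""
    if byte_at_ffff0 != 0xEA:
        # 0x26-0x33: no far-jump opcode at 0xffff0, so consult port 0xcf9.
        if cf9_value == 0x04:
            return "hard reset (write 0x06 to 0xcf9)"
        return "NotWarmStart"
    # 0x34-0x42: 0xea is present, check the far-jump target that follows it.
    if dword_at_ffff1 != 0xF000E05B:
        return "NotWarmStart"
    # 0x44-0x4e: only the BSP takes the shadowed entry point; APs halt.
    return "ljmp 0xf000:0xfff0" if is_bsp else "cli; hlt"


# The three states of 0xffff0 described above:
print(reset_path(0x00, 0x00000000, 0x00, True))   # no 0xea
print(reset_path(0xEA, 0xF000E05B, 0x00, True))   # 0xea and 0xf000e05b
print(reset_path(0xEA, 0x12345678, 0x00, True))   # 0xea but not 0xf000e05b
```

The 0x12345678 value is an arbitrary stand-in for "anything other than 0xf000e05b"; what such a location would really contain is exactly the open question.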

    My guess is that containing 0xea and 0xf000e05b means it is a legacy boot and a warm start. Containing 0xea and not 0xf000e05b means it is a warm UEFI boot. Not containing 0xea means the RAM contains nothing useful in either mode, and if it's actually a warm boot then it needs to issue a PLTRST#. That's sort of the only option remaining.

    That leaves me to theorise that, seeing as that 3rd check doesn't occur on your UEFI BIOS, you see identical code in UEFI mode, whereas if I were to boot into UEFI mode, I reckon I'd see different code at 0xffff0 to 0xfffffff0, but also different code to what it would be in UEFI legacy boot. This is possibly some 16-bit shorthand for UEFI warm boot in shadowed RAM, which is still present after a warm boot, and which UEFI will detect and jump to (or whose data it will use later on). On your system, the shadow RAM at this location is not being used and accesses are being directed to SPI ROM instead.

    Maybe yours implements it differently: it shadows to a different region of the 1MiB space, uses a different PAM, and detects it later on (and therefore doesn't need to clarify the 0xea with an extra step). It may assume the UEFI shadow in the 700MiB range is corrupt (because the OS could overwrite it, though some of it remains resident; I'm not sure what the policy is on this). The 1MiB range may be the only safe place to shadow warm start data, and it can't shadow to 0xff000000–0xffffffff as that range can only ever be decoded to DMI and the RAM behind it is typically reclaimed elsewhere. If it assumes the OS doesn't overwrite the UEFI data in RAM, then your shadow might not be in the lower 1MiB at all, and the check further in may be checking the 700MiB region for the warm start implementation. The warm start implementation will assume services are loaded and devices are already enumerated, and will let you select a new boot device if you want.

    The reason edk2 calls the routine 'NotWarmStart', even though it doesn't check RAM or support warm start like our implementations, is (I'd imagine) that 0xcf9 tells the processor whether a warm boot / soft reset has occurred on the system. That is, an INIT# has been sent to the processor: bit3 is high but bit2 is low, and the code currently executing is implicitly on the processor that was INITed. (I can only assume this bit goes low only after a PLTRST# or by writing 0 to it.) It can therefore still tell that it is a warm start, but it needs to perform a PLTRST# whether the RAM contains useful data or not, because the warm start system state will never be made use of.

    Also, there is no loop at hlt. hlt puts the processor in a HALT state, from which an INIT# IPI moves it to the wait-for-SIPI state. Execution will then begin at whatever address the BSP selects for the AP.

  • 2021-02-07 09:25

    According to this Dr. Dobb's article written by Pete Dice of Intel back in 2011:

    Because the processor cache is not enabled by default, it is not uncommon to flush cache in this step with a WBINV instruction. The WBINV is not needed on newer processors, but it doesn't hurt anything.

    I'm not sure what WBINVD has to do with cache not being enabled by default. I thought he might have meant that WBINVD would enable the cache but the documentation doesn't say anything about the instruction having this effect. I think the second sentence confirms Margaret's suspicion that this is a case of cargo cult.

  • 2021-02-07 09:34

    Although it is hard to reason about, remember that the load mov al, byte es:[0xfff0] is not reading the BIOS's first instruction, even though es is set to 0xf000.

    The first instruction is read from 0xfffffff0. The PCH will probably also alias 0xf0000-0xfffff to 0xffff0000-0xffffffff at reset, so when the BSP boots it will execute the code you dumped.
    IIRC, the APs don't boot unless explicitly woken up.

    The BSP will then proceed with initialising the HW (judging from the dump).
    At some point it will set the attribute map for 0xf0000-0xfffff to steer reads and writes (or just writes and then reads) to memory.
    The end result is that when a processor (a HW thread) boots, it will execute the code from the flash until it performs a far jump.
    At that point the cs base is recomputed as per real-mode rules (pretty much like unreal mode) and instructions will be fetched from 0xf0000-0xfffff (i.e. from the RAM).
    All of this while the cs segment value didn't actually change.
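
The address arithmetic behind this can be sketched in Python. It assumes the architectural reset state (CS selector 0xf000 with a hidden base of 0xffff0000, IP 0xfff0) and ordinary real-mode rules (base = selector << 4) once a segment register is reloaded; the helper names are hypothetical.

```python
RESET_CS_BASE = 0xFFFF0000  # hidden CS base right after reset
RESET_IP = 0xFFF0


def real_mode_base(selector: int) -> int:
    """Segment base as computed when a segment register is loaded in real mode."""
    return (selector & 0xFFFF) << 4


def linear(base: int, offset: int) -> int:
    """Linear address for base:offset, truncated to 32 bits."""
    return (base + offset) & 0xFFFFFFFF


# First instruction fetch: selector is 0xf000 but the hidden base is special.
first_fetch = linear(RESET_CS_BASE, RESET_IP)

# mov es, 0xf000 followed by a load from es:[0xfff0] uses the normal rule.
data_load = linear(real_mode_base(0xF000), 0xFFF0)

# ljmp 0xf000:0xfff0 reloads cs, so the hidden base becomes 0xf0000 as well.
after_far_jump = linear(real_mode_base(0xF000), 0xFFF0)

print(hex(first_fetch), hex(data_load), hex(after_far_jump))
# prints: 0xfffffff0 0xffff0 0xffff0
```

The selector value is 0xf000 in all three cases; only the hidden base differs, which is why the same-looking address can land in two different places.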

    The BSP at some point will start its multiprocessor initialisation routine, where it broadcasts to everyone (including itself) an INIT-SIPI-SIPI that will result in a sleep for the APs and a ljmp 0xf000:0xfff0 for the BSP.
    The trick here is that the target of the jump, 0xf000:0xfff0, is not the same bus address as that of the wbinvd instruction.
    There could be something else there, probably another initialisation routine.

    At the end of the initialisation the BIOS could simply reset the attributes of the 0xf0000-0xfffff to fall through to the flash (so a software reset is possible), preventing (not intentionally) a dump of the intermediary code.

    This is not very efficient, but BIOSes are not usually masterpieces of code.

    I don't have enough information to be sure what's going on; my point is that the ljmp 0xf000:0xfff0 and the mov al, byte es:[0xfff0] don't have to read from the same region they reside in.
    With this in mind, all bets are off.
    Only a proper reverse engineering will tell.

    Regarding the wbinvd, I suggested in the comments that it could be related to the warm boot facility, and Peter Cordes suggested that it may specifically have to do with cache-as-RAM.
    It makes sense, though I guess we will never be sure.
    It could just as well be a case of cargo cult, where a programmer deemed the instruction necessary based on rumors.
