Assembly - x86 call instruction and memory address?

灰色年华 2021-01-12 15:05

I've been reading some assembly code and I've started seeing that call instructions are actually program-counter relative.

However, whenever I'm using visual stud

1 answer
  •  清酒与你
    2021-01-12 16:00

    If you're disassembling .o object files that haven't been linked yet, the call address will just be a placeholder to be filled in by the linker.

    You can use objdump -drwC -Mintel to show the relocation types + symbol names from a .o. (The -r option is the key. Or -R for an already-linked shared library.)


    Disassemblers show the actual address of the jump target because that's more useful to the user than disassembling it as jcc eip-1234H or something. Linked executables have a default load address, so the disassembler has a value for eip at every instruction, and this is usually present in disassembly output.

    e.g. in some asm code I wrote (where I use symbol names that made it into the object file, so the loop branch target is actually visible to the disassembler):

    objdump -M intel  -d rs-asmbench:
    ...
    00000000004020a0 <.loop>:
      4020a0:       0f b6 c2                movzx  eax,dl
      4020a3:       0f b6 de                movzx  ebx,dh
       ...
      402166:       49 83 c3 10             add    r11,0x10
      40216a:       0f 85 30 ff ff ff       jne    4020a0 <.loop>
    
    0000000000402170 <.last8>:
      402170:       0f b6 c2                movzx  eax,dl
    

    Note that the jne encodes a signed little-endian 32-bit displacement of -0xD0 bytes. (Jumps add their displacement to the value of e/rip after the jump, i.e. the address of the next instruction. The jump instruction itself is 6 bytes long, so the displacement has to be -0xD0, not just -0xCA.) 0x100 - 0xD0 = 0x30, which is the value of the least-significant byte of the 2's complement displacement.
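
    As a quick check of that arithmetic, here is a minimal Python sketch that recomputes the displacement bytes from the addresses in the listing. The helper name encode_rel32 is made up for this illustration; it isn't part of objdump or any other tool.

```python
import struct

def encode_rel32(insn_addr: int, insn_len: int, target: int) -> bytes:
    """Return the little-endian rel32 field of a near jump or call.

    The displacement is measured from the address of the *next*
    instruction (the value of e/rip after the jump), not from the
    start of the jump itself.
    """
    next_insn = insn_addr + insn_len
    disp = target - next_insn
    return struct.pack("<i", disp)  # signed 32-bit, little-endian

# jne at 0x40216a (6 bytes: 0f 85 + rel32) targeting .loop at 0x4020a0
rel32 = encode_rel32(0x40216A, 6, 0x4020A0)
print(" ".join(f"{b:02x}" for b in rel32))   # 30 ff ff ff
print(hex(struct.unpack("<i", rel32)[0]))    # -0xd0
```

    The printed bytes match the 30 ff ff ff that follows the 0f 85 opcode in the disassembly.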

    In your question, you're talking about the call addresses being 0xFFFF..., which makes little sense unless that's just a placeholder, or you thought the non-0xFF bytes in the displacement were part of the opcode.

    Before linking, references to external symbols look like this:

    objdump -M intel -d main.o
      ...
      a5:   31 f6                   xor    esi,esi
      a7:   e8 00 00 00 00          call   ac 
      ac:   4c 63 e0                movsxd r12,eax
      af:   ba 00 00 00 00          mov    edx,0x0
      b4:   48 89 de                mov    rsi,rbx
      b7:   44 89 f7                mov    edi,r14d
      ba:   e8 00 00 00 00          call   bf 
      bf:   83 f8 ff                cmp    eax,0xffffffff
      c2:   75 cc                   jne    90 
      ...
    

    Notice how the call instructions have their relative displacement = 0. So before the linker has slotted in the actual relative value, they encode a call with a target of the instruction right after the call. (i.e. RIP = RIP+0). The call bf is immediately followed by an instruction that starts at 0xbf from the start of the section. The other call has a different target address because it's at a different place in the file. (gcc puts main in its own section: .text.startup).
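
    To see why objdump prints call ac and call bf for those placeholders, here is a small Python sketch that decodes a 5-byte e8 call the same way. The function name call_rel32_target is hypothetical, and it only handles this one encoding.

```python
import struct

def call_rel32_target(insn_addr: int, encoding: bytes) -> int:
    """Return the target address of a 5-byte `call rel32` (opcode e8)."""
    if len(encoding) != 5 or encoding[0] != 0xE8:
        raise ValueError("expected a 5-byte e8 call rel32 encoding")
    (disp,) = struct.unpack("<i", encoding[1:])  # signed, little-endian
    # The displacement is added to the address of the next instruction.
    return insn_addr + len(encoding) + disp

placeholder = bytes.fromhex("e800000000")  # rel32 not yet filled in by the linker
print(hex(call_rel32_target(0xA7, placeholder)))  # 0xac
print(hex(call_rel32_target(0xBA, placeholder)))  # 0xbf
```

    With a zero displacement the "target" is just the address of whatever follows the call, which is why the two calls show different addresses.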

    So, if you want to make sense of what's actually being called, look at a linked executable, or get a disassembler that looks at the object file symbols to slot in symbolic names for call targets instead of showing them as calls with zero displacement.

    Relative jumps to local symbols already get resolved before linking:

    objdump -Mintel  -d asm-pinsrw.o:
    0000000000000040 <.loop>:
      40:   0f b6 c2                movzx  eax,dl
      43:   0f b6 de                movzx  ebx,dh
      ...
     106:   49 83 c3 10             add    r11,0x10
     10a:   0f 85 30 ff ff ff       jne    40 <.loop>
    0000000000000110 <.last8>:
     110:   0f b6 c2                movzx  eax,dl
    

    Note that the relative jump to a symbol in the same file has exactly the same instruction encoding as in the linked executable, even though the .o file has no base address and the disassembler just treats it as zero.

    See Intel's reference manual for instruction encoding; links are at https://stackoverflow.com/tags/x86/info. Even in 64-bit mode, call only supports 32-bit sign-extended relative offsets. A full 64-bit absolute target is only reachable through an indirect call (with the address in a register or memory operand). (In 32-bit mode, 16-bit relative displacements are supported with an operand-size prefix, I guess saving one instruction byte.)
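
    To make the 32-bit limit concrete, here is a rough Python sketch of the range check an assembler or linker has to do before it can use call rel32. The function name fits_rel32 and the far-away address are made up for illustration.

```python
def fits_rel32(insn_addr: int, insn_len: int, target: int) -> bool:
    """True if `target` is reachable with a sign-extended 32-bit displacement."""
    disp = target - (insn_addr + insn_len)  # relative to the next instruction
    return -(1 << 31) <= disp <= (1 << 31) - 1

# A nearby target in the same executable is fine.
print(fits_rel32(0x4020A0, 5, 0x402170))           # True
# A hypothetical target far outside +-2GiB needs an indirect call instead.
print(fits_rel32(0x4020A0, 5, 0x7FFF_0000_0000))   # False
```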
