VA (Virtual Address) & RVA (Relative Virtual Address)

我在风中等你 2020-11-28 01:39

A file that is given as input to the linker is called Object File. The linker produces an Image file, which in turn is us

2 Answers
  • 2020-11-28 02:07

    Most 32-bit Windows executables (*.exe) are, by default, loaded at the user-mode memory address 0x00400000. That is what we call a "virtual address" (VA): it is visible only within each process, and the OS translates it to a different physical address (visible at the kernel / driver layer).

    For example, physical memory might be laid out like this:

    0x00300000 on physical memory has process A's main
    0x00500000 on physical memory has process B's main
    

    And the OS may have a mapping table:

    process A's 0x00400000 (VA) = physical address 0x00300000
    process B's 0x00400000 (VA) = physical address 0x00500000
    

    Then when you read 0x00400000 in process A, you get the content located at 0x00300000 in physical memory.
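The lookup described above can be sketched as a toy model. The `PAGE_TABLES` dictionary and the `translate` function are hypothetical names made up for illustration; a real OS uses multi-level page tables managed by the kernel, not a per-process dictionary.

```python
# Toy model of per-process address translation.
# Assumption: one 4 KiB page per process, using the example addresses above.
PAGE_SIZE = 0x1000

PAGE_TABLES = {
    "A": {0x00400000: 0x00300000},  # process A: VA page -> physical page
    "B": {0x00400000: 0x00500000},  # process B: same VA, different physical page
}

def translate(process, va):
    """Translate a virtual address to a physical address for one process."""
    page_base = va & ~(PAGE_SIZE - 1)   # start of the page containing va
    offset = va & (PAGE_SIZE - 1)       # position of va inside that page
    return PAGE_TABLES[process][page_base] + offset

print(hex(translate("A", 0x00400000)))  # 0x300000
print(hex(translate("B", 0x00400010)))  # 0x500010
```

The same VA resolves to different physical addresses depending on which process asks, which is the point of the mapping table.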

    Regarding RVA, it's simply designed to ease relocation. When loading relocatable modules (e.g., a DLL), the system may slide them around in the process memory space. So the file layout stores "relative" addresses to make the calculation easier.

    For example, a DLL C may have this address:

     RVA 0x00001000 DLL C's main entry
    

    When loaded into process A at base address 0x10000000, C's main entry becomes

     VA = 0x10000000 + 0x00001000 = 0x10001000
     (if process A's VA 0x10000000 is mapped to physical address 0x30000000, then
      C's main entry will be at physical address 0x30001000).
    

    When loaded into process B at base address 0x32000000, C's main entry becomes

     VA = 0x32000000 + 0x00001000 = 0x32001000
     (if process B's VA 0x32000000 is mapped to physical address 0x50000000, then
      C's main entry will be at physical address 0x50001000).
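The two calculations above follow the same rule, VA = load base + RVA. A minimal sketch, with `rva_to_va` and `va_to_rva` as hypothetical helper names:

```python
def rva_to_va(load_base, rva):
    """VA of an item, given the module's load base and the item's RVA."""
    return load_base + rva

def va_to_rva(load_base, va):
    """Inverse direction: recover the RVA from a VA inside the module."""
    return va - load_base

ENTRY_RVA = 0x00001000  # DLL C's main entry, from the example above

for process, base in (("A", 0x10000000), ("B", 0x32000000)):
    print(process, hex(rva_to_va(base, ENTRY_RVA)))
```

The RVA stored in the file never changes; only the load base differs between processes.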
    

    Usually an RVA in an image file is relative to the base address at which the image is loaded into memory, but some RVAs may be relative to a "section" starting address in image or object files (you have to check the PE format spec for details). Either way, an RVA is relative to "some" base VA.

    To summarize,

    1. A physical memory address is what the memory hardware sees
    2. A virtual address (VA) is mapped to a physical address, per process (managed by the OS)
    3. An RVA is relative to a VA (load base or section base), per file (managed by the linker and loader)

    (edit) regarding claw's new question:

    The RVA of a method/variable is NOT always its offset from the beginning of the file. It is usually relative to some VA, which may be a default load base address or a section base VA. That's why I say you must check the PE format spec for details.

    Your tool, PEView, displays every byte's RVA relative to the load base address. Since the sections start at different bases, the RVA can jump when crossing from one section to the next.

    Regarding your guesses, they are very close to the correct answers:

    1. Usually we don't talk about an "RVA" before the sections, but the PE header is still loaded, up to the end of the section headers. A gap between the section headers and the first section body (if any) won't be loaded. You can check that with a debugger. Moreover, when there's a gap between sections, it may not be loaded either.

    2. As I said, an RVA is simply "relative to some VA", no matter which VA it is (although when talking about PE, the VA usually refers to the load base address). When you read the PE format spec you may find some "RVAs" that are relative to a special address, such as the resource starting address. PEView lists RVAs from 0x1000 because that section starts at 0x1000. Why 0x1000? Because the linker reserved 0x1000 bytes for the PE header, so the section RVAs start at 0x1000.

    3. What you've missed is the concept of a "section" in the PE loading stage. A PE file may contain several "sections", and each section maps to its own starting VA. For example, this is dumped from the Windows 7 kernel32.dll:

      #  Name   VirtSize RVA      PhysSize Offset
      1 .text   000C44C1 00001000 000C4600 00000800
      2 .data   00000FEC 000C6000 00000E00 000C4E00
      3 .rsrc   00000520 000C7000 00000600 000C5C00
      4 .reloc  0000B098 000C8000 0000B200 000C6200
      

      There is an invisible "0 header RVA=0000, SIZE=1000" entry, which forces .text to start at RVA 1000. The sections are contiguous when loaded into memory (i.e., in VA space), so their RVAs are contiguous. However, since memory is allocated in pages, each section occupies a multiple of the page size (4096 = 0x1000 bytes). That's why section #2 starts at 1000 + C5000 = C6000 (C5000 is C44C1 rounded up to the next multiple of 0x1000).

      To support memory mapping, the sections in the file must also be aligned to some size (the file alignment, decided by the linker; in my example above it's 0x200 = 512 bytes), which controls the PhysSize field. Offset means "offset from the beginning of the physical PE file".

      So the headers occupy 0x800 bytes of the file (and 0x1000 when mapped into memory), and 0x800 is the offset of section #1. Aligning its data (C44C1 bytes) to 0x200 gives PhysSize C4600. C4600 + 800 = C4E00, which is exactly the offset of the second section.

      OK, this is related to whole PE loading stuff so it may be a little hard to understand...
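The alignment arithmetic for the kernel32.dll table above can be checked with a short sketch. The helper names `align_up` and `check_layout` are hypothetical, and the two alignment constants are the values stated in the answer, not read from a real file. Note that the next file offset is derived from PhysSize, not from VirtSize: for .data the PhysSize (E00) is smaller than the VirtSize (FEC).

```python
SECTION_ALIGNMENT = 0x1000  # in-memory alignment (page size), per the text
FILE_ALIGNMENT = 0x200      # on-disk alignment, per the text

def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

# (name, VirtSize, RVA, PhysSize, Offset), copied from the table above
SECTIONS = [
    (".text",  0x000C44C1, 0x00001000, 0x000C4600, 0x00000800),
    (".data",  0x00000FEC, 0x000C6000, 0x00000E00, 0x000C4E00),
    (".rsrc",  0x00000520, 0x000C7000, 0x00000600, 0x000C5C00),
    (".reloc", 0x0000B098, 0x000C8000, 0x0000B200, 0x000C6200),
]

def check_layout(sections):
    """Check that each section starts where the previous one ends."""
    for cur, nxt in zip(sections, sections[1:]):
        _, vsize, rva, psize, offset = cur
        next_rva = rva + align_up(vsize, SECTION_ALIGNMENT)  # in memory
        next_offset = offset + psize                         # in the file
        if (next_rva, next_offset) != (nxt[2], nxt[4]):
            return False
    return True

print(hex(align_up(0xC44C1, SECTION_ALIGNMENT)))  # 0xc5000
print(hex(align_up(0xC44C1, FILE_ALIGNMENT)))     # 0xc4600
print(check_layout(SECTIONS))                     # True
```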

    (edit) let me make a new simple summary again.

    1. The "RVAs" in DLL/EXE (PE format) files are usually relative to the "load base address in memory" (but not always; you must read the spec)
    2. The PE format contains a "section" mapping structure that maps the physical file content into memory. So an RVA is not simply a file offset.
    3. To calculate the RVA of some byte, you have to find its offset within its section and add the section's base RVA.
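Point 3 and its inverse can be sketched against the kernel32.dll section table quoted earlier in this answer. `rva_to_file_offset` and `file_offset_to_rva` are hypothetical helper names, and the table is hard-coded from the answer's dump; a real tool would read it from the section headers.

```python
# (name, VirtSize, RVA, PhysSize, Offset), copied from the dump in the answer
SECTIONS = [
    (".text",  0x000C44C1, 0x00001000, 0x000C4600, 0x00000800),
    (".data",  0x00000FEC, 0x000C6000, 0x00000E00, 0x000C4E00),
    (".rsrc",  0x00000520, 0x000C7000, 0x00000600, 0x000C5C00),
    (".reloc", 0x0000B098, 0x000C8000, 0x0000B200, 0x000C6200),
]
HEADER_FILE_SIZE = 0x800  # headers occupy 0x800 bytes of the file, per the answer

def file_offset_to_rva(offset):
    """RVA of the byte at a file offset, or None if it is outside the file data."""
    if offset < HEADER_FILE_SIZE:
        return offset  # headers are mapped at RVA 0
    for _, _, sec_rva, psize, sec_offset in SECTIONS:
        if sec_offset <= offset < sec_offset + psize:
            return sec_rva + (offset - sec_offset)  # section base + offset in section
    return None

def rva_to_file_offset(rva):
    """File offset backing an RVA, or None if no file byte backs it."""
    if rva < HEADER_FILE_SIZE:
        return rva
    for _, vsize, sec_rva, psize, sec_offset in SECTIONS:
        if sec_rva <= rva < sec_rva + vsize:
            delta = rva - sec_rva
            # The part of a section beyond PhysSize exists only in memory.
            return sec_offset + delta if delta < psize else None
    return None

print(hex(file_offset_to_rva(0xC4E10)))  # 0xc6010
print(hex(rva_to_file_offset(0xC6010)))  # 0xc4e10
```

An RVA in the padding between the headers and .text, or in the tail of .data past its PhysSize, has no file offset, which is why the second function can return `None`.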
  • 2020-11-28 02:18

    A relative virtual address is an offset from the address at which the file is loaded. Probably the simplest way to get the idea is with an example. Assume you have a file (e.g., a DLL) that's loaded at address 1000h. In that file, you have a variable at RVA 200h. In that case, the VA of that variable (after the DLL is mapped to memory) is 1200h (i.e., the 1000h base address of the DLL plus the 200h RVA (offset) to the variable).
