Why an ELF executable could have 4 LOAD segments?

后端 未结 1 765
逝去的感伤
逝去的感伤 2020-12-03 23:41

There is a remote 64-bit *nix server that can compile a user-provided code (which should be written in Rust, but I don\'t think it matters since it uses LLVM). I don\'t know

相关标签:
1条回答
  • 2020-12-04 00:19

    A typical BFD-ld or Gold linked Linux executable has 2 loadable segments, with the ELF header merged with .text and .rodata into the first RE segment, and .data, .bss and other writable sections merged into the second RW segment.

    Here is the typical section to segment mapping:

    $ echo "int foo; int main() { return 0;}"  | clang -xc - -o a.out-gold -fuse-ld=gold
    $ readelf -Wl a.out-gold
    
    Elf file type is EXEC (Executable file)
    Entry point 0x400420
    There are 9 program headers, starting at offset 64
    
    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R   0x8
      INTERP         0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R   0x1
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
      LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0006b0 0x0006b0 R E 0x1000
      LOAD           0x000e18 0x0000000000401e18 0x0000000000401e18 0x0001f8 0x000200 RW  0x1000
      DYNAMIC        0x000e28 0x0000000000401e28 0x0000000000401e28 0x0001b0 0x0001b0 RW  0x8
      NOTE           0x000254 0x0000000000400254 0x0000000000400254 0x000020 0x000020 R   0x4
      GNU_EH_FRAME   0x00067c 0x000000000040067c 0x000000000040067c 0x000034 0x000034 R   0x4
      GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
      GNU_RELRO      0x000e18 0x0000000000401e18 0x0000000000401e18 0x0001e8 0x0001e8 RW  0x8
    
     Section to Segment mapping:
      Segment Sections...
       00
       01     .interp
       02     .interp .note.ABI-tag .dynsym .dynstr .gnu.hash .hash .gnu.version .gnu.version_r .rela.dyn .init .text .fini .rodata .eh_frame .eh_frame_hdr
       03     .fini_array .init_array .dynamic .got .got.plt .data .bss
       04     .dynamic
       05     .note.ABI-tag
       06     .eh_frame_hdr
       07
       08     .fini_array .init_array .dynamic .got .got.plt
    

    This optimizes the number of mmaps that the kernel must perform to load such executable, but at a security cost: the data in .rodata shouldn't be executable, but is (because it's merged with .text, which must be executable). This may significantly increase the attack surface for someone trying to hijack a process.

    Newer Linux systems, in particular using LLD to link binaries, prioritize security over speed, and put ELF header and .rodata into the first R-only segment, resulting in 3 load segments and improved security. Here is a typical mapping:

    $ echo "int foo; int main() { return 0;}"  | clang -xc - -o a.out-lld -fuse-ld=lld
    $ readelf -Wl a.out-lld
    
    Elf file type is EXEC (Executable file)
    Entry point 0x201000
    There are 10 program headers, starting at offset 64
    
    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x000230 0x000230 R   0x8
      INTERP         0x000270 0x0000000000200270 0x0000000000200270 0x00001c 0x00001c R   0x1
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
      LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000558 0x000558 R   0x1000
      LOAD           0x001000 0x0000000000201000 0x0000000000201000 0x000185 0x000185 R E 0x1000
      LOAD           0x002000 0x0000000000202000 0x0000000000202000 0x001170 0x002005 RW  0x1000
      DYNAMIC        0x003010 0x0000000000203010 0x0000000000203010 0x000150 0x000150 RW  0x8
      GNU_RELRO      0x003000 0x0000000000203000 0x0000000000203000 0x000170 0x001000 R   0x1
      GNU_EH_FRAME   0x000440 0x0000000000200440 0x0000000000200440 0x000034 0x000034 R   0x1
      GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
      NOTE           0x00028c 0x000000000020028c 0x000000000020028c 0x000020 0x000020 R   0x4
    
     Section to Segment mapping:
      Segment Sections...
       00
       01     .interp
       02     .interp .note.ABI-tag .rodata .dynsym .gnu.version .gnu.version_r .gnu.hash .hash .dynstr .rela.dyn .eh_frame_hdr .eh_frame
       03     .text .init .fini
       04     .data .tm_clone_table .fini_array .init_array .dynamic .got .bss
       05     .dynamic
       06     .fini_array .init_array .dynamic .got
       07     .eh_frame_hdr
       08
       09     .note.ABI-tag
    

    Not to be left behind, the newer BFD-ld (my version is 2.31.1) also makes ELF header and .rodata read-only, but fails to merge two R-only segments into one, resulting in 4 loadable segments:

    $ echo "int foo; int main() { return 0;}"  | clang -xc - -o a.out-bfd -fuse-ld=bfd
    $ readelf -Wl a.out-bfd
    
    Elf file type is EXEC (Executable file)
    Entry point 0x401020
    There are 11 program headers, starting at offset 64
    
    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x000268 0x000268 R   0x8
      INTERP         0x0002a8 0x00000000004002a8 0x00000000004002a8 0x00001c 0x00001c R   0x1
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
      LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0003f8 0x0003f8 R   0x1000
      LOAD           0x001000 0x0000000000401000 0x0000000000401000 0x00018d 0x00018d R E 0x1000
      LOAD           0x002000 0x0000000000402000 0x0000000000402000 0x000110 0x000110 R   0x1000
      LOAD           0x002e40 0x0000000000403e40 0x0000000000403e40 0x0001e8 0x0001f0 RW  0x1000
      DYNAMIC        0x002e50 0x0000000000403e50 0x0000000000403e50 0x0001a0 0x0001a0 RW  0x8
      NOTE           0x0002c4 0x00000000004002c4 0x00000000004002c4 0x000020 0x000020 R   0x4
      GNU_EH_FRAME   0x002004 0x0000000000402004 0x0000000000402004 0x000034 0x000034 R   0x4
      GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
      GNU_RELRO      0x002e40 0x0000000000403e40 0x0000000000403e40 0x0001c0 0x0001c0 R   0x1
    
     Section to Segment mapping:
      Segment Sections...
       00
       01     .interp
       02     .interp .note.ABI-tag .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn
       03     .init .text .fini
       04     .rodata .eh_frame_hdr .eh_frame
       05     .init_array .fini_array .dynamic .got .got.plt .data .bss
       06     .dynamic
       07     .note.ABI-tag
       08     .eh_frame_hdr
       09
       10     .init_array .fini_array .dynamic .got
    

    Finally, some of these choices are affected by the --(no)rosegment linker option.

    0 讨论(0)
提交回复
热议问题