How do debug symbols affect performance of a Linux executable compiled by GCC?

感动是毒 2020-12-15 17:06

All other factors being equal (e.g. optimisation level), how does having debug symbols in an ELF executable or shared object (.so) affect:

  1. Load time.
  2. Runtime memory footprint.

2 Answers
  • 2020-12-15 17:08

    The debug symbols are located in totally different sections from the code/data sections. You can check it with objdump:

    $ objdump -h a.out
    
    a.out:     file format elf64-x86-64
    
    Sections:
    Idx Name          Size      VMA               LMA               File off  Algn
      0 .interp       0000001c  0000000000400200  0000000000400200  00000200  2**0
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      1 .note.ABI-tag 00000020  000000000040021c  000000000040021c  0000021c  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      2 .note.gnu.build-id 00000024  000000000040023c  000000000040023c  0000023c  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      3 .hash         00000018  0000000000400260  0000000000400260  00000260  2**3
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      4 .gnu.hash     0000001c  0000000000400278  0000000000400278  00000278  2**3
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      5 .dynsym       00000048  0000000000400298  0000000000400298  00000298  2**3
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      6 .dynstr       00000038  00000000004002e0  00000000004002e0  000002e0  2**0
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      7 .gnu.version  00000006  0000000000400318  0000000000400318  00000318  2**1
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      8 .gnu.version_r 00000020  0000000000400320  0000000000400320  00000320  2**3
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      9 .rela.dyn     00000018  0000000000400340  0000000000400340  00000340  2**3
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
     10 .rela.plt     00000018  0000000000400358  0000000000400358  00000358  2**3
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
     11 .init         00000018  0000000000400370  0000000000400370  00000370  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
     12 .plt          00000020  0000000000400388  0000000000400388  00000388  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
     13 .text         000001c8  00000000004003b0  00000000004003b0  000003b0  2**4
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
     14 .fini         0000000e  0000000000400578  0000000000400578  00000578  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
     15 .rodata       00000004  0000000000400588  0000000000400588  00000588  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
     16 .eh_frame_hdr 00000024  000000000040058c  000000000040058c  0000058c  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
     17 .eh_frame     0000007c  00000000004005b0  00000000004005b0  000005b0  2**3
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
     18 .ctors        00000010  0000000000600630  0000000000600630  00000630  2**3
                      CONTENTS, ALLOC, LOAD, DATA
     19 .dtors        00000010  0000000000600640  0000000000600640  00000640  2**3
                      CONTENTS, ALLOC, LOAD, DATA
     20 .jcr          00000008  0000000000600650  0000000000600650  00000650  2**3
                      CONTENTS, ALLOC, LOAD, DATA
     21 .dynamic      000001a0  0000000000600658  0000000000600658  00000658  2**3
                      CONTENTS, ALLOC, LOAD, DATA
     22 .got          00000008  00000000006007f8  00000000006007f8  000007f8  2**3
                      CONTENTS, ALLOC, LOAD, DATA
     23 .got.plt      00000020  0000000000600800  0000000000600800  00000800  2**3
                      CONTENTS, ALLOC, LOAD, DATA
     24 .data         00000010  0000000000600820  0000000000600820  00000820  2**3
                      CONTENTS, ALLOC, LOAD, DATA
     25 .bss          00000010  0000000000600830  0000000000600830  00000830  2**3
                      ALLOC
     26 .comment      00000039  0000000000000000  0000000000000000  00000830  2**0
                      CONTENTS, READONLY
     27 .debug_aranges 00000030  0000000000000000  0000000000000000  00000869  2**0
                      CONTENTS, READONLY, DEBUGGING
     28 .debug_pubnames 0000001b  0000000000000000  0000000000000000  00000899  2**0
                      CONTENTS, READONLY, DEBUGGING
     29 .debug_info   00000055  0000000000000000  0000000000000000  000008b4  2**0
                      CONTENTS, READONLY, DEBUGGING
     30 .debug_abbrev 00000034  0000000000000000  0000000000000000  00000909  2**0
                      CONTENTS, READONLY, DEBUGGING
     31 .debug_line   0000003b  0000000000000000  0000000000000000  0000093d  2**0
                      CONTENTS, READONLY, DEBUGGING
     32 .debug_str    00000026  0000000000000000  0000000000000000  00000978  2**0
                      CONTENTS, READONLY, DEBUGGING
     33 .debug_loc    0000004c  0000000000000000  0000000000000000  0000099e  2**0
                      CONTENTS, READONLY, DEBUGGING
    

    You can see the extra sections (27 through 33), marked DEBUGGING. Unlike the code and data sections, they lack the ALLOC and LOAD flags, so they are not loaded at runtime and there is no performance penalty. Using gdb, you can also confirm which sections are mapped at runtime:

    $ gdb ./a.out
    (gdb) break main
    (gdb) run
    (gdb) info files
    // blah blah ....
    Local exec file:
            `/home/kghost/a.out', file type elf64-x86-64.
            Entry point: 0x4003b0
            0x0000000000400200 - 0x000000000040021c is .interp
            0x000000000040021c - 0x000000000040023c is .note.ABI-tag
            0x000000000040023c - 0x0000000000400260 is .note.gnu.build-id
            0x0000000000400260 - 0x0000000000400278 is .hash
            0x0000000000400278 - 0x0000000000400294 is .gnu.hash
            0x0000000000400298 - 0x00000000004002e0 is .dynsym
            0x00000000004002e0 - 0x0000000000400318 is .dynstr
            0x0000000000400318 - 0x000000000040031e is .gnu.version
            0x0000000000400320 - 0x0000000000400340 is .gnu.version_r
            0x0000000000400340 - 0x0000000000400358 is .rela.dyn
            0x0000000000400358 - 0x0000000000400370 is .rela.plt
            0x0000000000400370 - 0x0000000000400388 is .init
            0x0000000000400388 - 0x00000000004003a8 is .plt
            0x00000000004003b0 - 0x0000000000400578 is .text
            0x0000000000400578 - 0x0000000000400586 is .fini
            0x0000000000400588 - 0x000000000040058c is .rodata
            0x000000000040058c - 0x00000000004005b0 is .eh_frame_hdr
            0x00000000004005b0 - 0x000000000040062c is .eh_frame
            0x0000000000600630 - 0x0000000000600640 is .ctors
            0x0000000000600640 - 0x0000000000600650 is .dtors
            0x0000000000600650 - 0x0000000000600658 is .jcr
            0x0000000000600658 - 0x00000000006007f8 is .dynamic
            0x00000000006007f8 - 0x0000000000600800 is .got
            0x0000000000600800 - 0x0000000000600820 is .got.plt
            0x0000000000600820 - 0x0000000000600830 is .data
            0x0000000000600830 - 0x0000000000600840 is .bss
    // blah blah ....
    

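    So the only penalty is the extra disk space needed to store this information. You can also use strip to remove the debug information (plain strip removes the regular symbol table as well; strip --strip-debug removes only the debug sections):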

    $ strip a.out
    

    Run objdump -h again and you'll see that the .debug_* sections are gone.
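
    The disk-versus-memory distinction can also be checked numerically. The following is a minimal sketch, assuming gcc and binutils' size are installed; hello.c, plain and debug are hypothetical names used only for this demo. It builds the same source with and without -g and compares the loaded sizes (text/data/bss) with the on-disk sizes.

```shell
#!/bin/sh
# Sketch: same source, same optimisation level, with and without -g.
# hello.c, plain and debug are made-up names for this demo only.
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
EOF

if command -v gcc >/dev/null 2>&1 && command -v size >/dev/null 2>&1; then
    gcc -O2    -o "$workdir/plain" "$workdir/hello.c"
    gcc -O2 -g -o "$workdir/debug" "$workdir/hello.c"

    # size prints a header line, then: text data bss dec hex filename.
    # Only allocated sections are counted, so .debug_* is excluded.
    loaded_plain=$(size "$workdir/plain" | awk 'NR == 2 { print $1, $2, $3 }')
    loaded_debug=$(size "$workdir/debug" | awk 'NR == 2 { print $1, $2, $3 }')

    # File size in bytes, which does include the debug sections.
    disk_plain=$(wc -c < "$workdir/plain")
    disk_debug=$(wc -c < "$workdir/debug")

    echo "loaded (text data bss): plain=[$loaded_plain] debug=[$loaded_debug]"
    echo "on disk (bytes):        plain=$disk_plain debug=$disk_debug"
else
    echo "gcc or size not found; skipping the comparison" >&2
fi
rm -rf "$workdir"
```

    If the reasoning above holds, the two text/data/bss triples should match while the -g build is larger on disk.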

    EDIT:

    The loader does not actually look at sections; it loads the ELF file according to its program headers, which you can view with objdump -p. (The following example uses a different ELF binary.)

    $ objdump -p /bin/cat
    
    /bin/cat:     file format elf64-x86-64
    
    Program Header:
        PHDR off    0x0000000000000040 vaddr 0x0000000000000040 paddr 0x0000000000000040 align 2**3
             filesz 0x00000000000001f8 memsz 0x00000000000001f8 flags r-x
      INTERP off    0x0000000000000238 vaddr 0x0000000000000238 paddr 0x0000000000000238 align 2**0
             filesz 0x000000000000001c memsz 0x000000000000001c flags r--
        LOAD off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**21
             filesz 0x00000000000078bc memsz 0x00000000000078bc flags r-x
        LOAD off    0x0000000000007c28 vaddr 0x0000000000207c28 paddr 0x0000000000207c28 align 2**21
             filesz 0x0000000000000678 memsz 0x0000000000000818 flags rw-
     DYNAMIC off    0x0000000000007dd8 vaddr 0x0000000000207dd8 paddr 0x0000000000207dd8 align 2**3
             filesz 0x00000000000001e0 memsz 0x00000000000001e0 flags rw-
        NOTE off    0x0000000000000254 vaddr 0x0000000000000254 paddr 0x0000000000000254 align 2**2
             filesz 0x0000000000000044 memsz 0x0000000000000044 flags r--
    EH_FRAME off    0x0000000000006980 vaddr 0x0000000000006980 paddr 0x0000000000006980 align 2**2
             filesz 0x0000000000000274 memsz 0x0000000000000274 flags r--
       STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
             filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
       RELRO off    0x0000000000007c28 vaddr 0x0000000000207c28 paddr 0x0000000000207c28 align 2**0
             filesz 0x00000000000003d8 memsz 0x00000000000003d8 flags r--
    

    The program headers tell the loader which segments to load and with which rwx flags. Multiple sections with the same flags are merged into a single segment.
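
    To see which sections end up in which segment, readelf -l prints a "Section to Segment mapping" table after the program headers. Here is a rough sketch, assuming gcc and binutils' readelf are available (hello.c and hello are hypothetical names), that counts the .debug_* sections in the file and then counts how many of them appear in that mapping.

```shell
#!/bin/sh
# Sketch: do any .debug_* sections belong to a segment?
# hello.c and hello are made-up names for this demo only.
export LC_ALL=C   # keep readelf's headings in English so the sed pattern matches
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
EOF

if command -v gcc >/dev/null 2>&1 && command -v readelf >/dev/null 2>&1; then
    gcc -g -o "$workdir/hello" "$workdir/hello.c"

    # Section header lines that name a .debug_* section.
    total_debug=$(readelf -SW "$workdir/hello" | grep -c ' \.debug_' || true)

    # Lines of the section-to-segment table that mention a .debug_* section.
    mapped_debug=$(readelf -lW "$workdir/hello" \
        | sed -n '/Section to Segment mapping/,$p' \
        | grep -c '\.debug_' || true)

    echo "debug sections in file:     $total_debug"
    echo "debug sections in segments: $mapped_debug"
else
    echo "gcc or readelf not found; skipping" >&2
fi
rm -rf "$workdir"
```

    A section that is not part of any segment is never mapped by the loader, which is the point made above.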

    BTW:

    The loader doesn't care about sections when loading an ELF file, but it will look at several symbol-related sections (such as .dynsym and .dynstr) to resolve symbols when needed.
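
    One way to see which symbol information is needed at runtime is to strip a binary and check what is left. This is only a sketch, assuming gcc, strip and readelf are installed and that the binary is dynamically linked (the usual default); hello.c and hello are hypothetical names.

```shell
#!/bin/sh
# Sketch: which symbol sections survive a plain strip?
# hello.c and hello are made-up names for this demo only.
export LC_ALL=C
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
EOF

if command -v gcc >/dev/null 2>&1 && command -v strip >/dev/null 2>&1 \
        && command -v readelf >/dev/null 2>&1; then
    gcc -g -o "$workdir/hello" "$workdir/hello.c"
    strip "$workdir/hello"

    sections=$(readelf -SW "$workdir/hello")
    # Count section header lines naming each kind of section.
    has_dynsym=$(printf '%s\n' "$sections" | grep -c ' \.dynsym ' || true)
    has_symtab=$(printf '%s\n' "$sections" | grep -c ' \.symtab ' || true)
    has_debug=$(printf '%s\n' "$sections" | grep -c ' \.debug_' || true)

    echo ".dynsym sections:  $has_dynsym  (used for dynamic symbol resolution)"
    echo ".symtab sections:  $has_symtab  (static symbol table)"
    echo ".debug_* sections: $has_debug"
else
    echo "gcc, strip or readelf not found; skipping" >&2
fi
rm -rf "$workdir"
```

    The stripped program should still run, since strip leaves the dynamic symbol table in place.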

  • 2020-12-15 17:13

    You might want to look at "Why does my code run slower with multiple threads than with a single thread when it is compiled for profiling (-pg)?" for a quick explanation of how debug symbols could affect optimization.

    To answer your questions:

    1. Load time is longer when debug symbols are present than when they are absent.
    2. The on-disk footprint will be larger.
    3. If you compile with no optimization, you really lose nothing. If you enable optimization, the optimized code will be less optimized because of the debug symbols.