gcc debug symbols (-g flag) vs linker's -rdynamic option

感动是毒 2020-12-12 13:38

glibc provides backtrace() and backtrace_symbols() to get the stack trace of a running program. But for this to work the program has to be built wi

1 answer
  •  醉梦人生
    2020-12-12 14:15

    According to the docs:

    This instructs the linker to add all symbols, not only used ones, to the dynamic symbol table.

    These are not debug symbols; they are dynamic linker symbols. strip does not remove them because doing so would (in most cases) break the executable: the runtime linker uses them to perform the final link stage of your executable.

    Example:

    $ cat t.c
    void foo() {}
    int main() { foo(); return 0; }
    

    Compile and link without -rdynamic (and no optimizations, obviously)

    $ gcc -O0 -o t t.c
    $ readelf -s t
    
    Symbol table '.dynsym' contains 3 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
         1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
         2: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
    
    Symbol table '.symtab' contains 50 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
         1: 0000000000400270     0 SECTION LOCAL  DEFAULT    1 
    ....
        27: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS t.c
        28: 0000000000600e14     0 NOTYPE  LOCAL  DEFAULT   18 __init_array_end
        29: 0000000000600e40     0 OBJECT  LOCAL  DEFAULT   21 _DYNAMIC
    

    So the executable has a .symtab with everything. But notice that .dynsym doesn't mention foo at all - it has only the bare essentials in there. This is not enough information for backtrace_symbols to work, because it relies on the information present in .dynsym to match code addresses with function names.

    Now compile with -rdynamic:

    $ gcc -O0 -o t t.c -rdynamic
    $ readelf -s t
    
    Symbol table '.dynsym' contains 17 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
         1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
         2: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
         3: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _Jv_RegisterClasses
         4: 0000000000601018     0 NOTYPE  GLOBAL DEFAULT  ABS _edata
         5: 0000000000601008     0 NOTYPE  GLOBAL DEFAULT   24 __data_start
         6: 0000000000400734     6 FUNC    GLOBAL DEFAULT   13 foo
         7: 0000000000601028     0 NOTYPE  GLOBAL DEFAULT  ABS _end
         8: 0000000000601008     0 NOTYPE  WEAK   DEFAULT   24 data_start
         9: 0000000000400838     4 OBJECT  GLOBAL DEFAULT   15 _IO_stdin_used
        10: 0000000000400750   136 FUNC    GLOBAL DEFAULT   13 __libc_csu_init
        11: 0000000000400650     0 FUNC    GLOBAL DEFAULT   13 _start
        12: 0000000000601018     0 NOTYPE  GLOBAL DEFAULT  ABS __bss_start
        13: 000000000040073a    16 FUNC    GLOBAL DEFAULT   13 main
        14: 0000000000400618     0 FUNC    GLOBAL DEFAULT   11 _init
        15: 00000000004007e0     2 FUNC    GLOBAL DEFAULT   13 __libc_csu_fini
        16: 0000000000400828     0 FUNC    GLOBAL DEFAULT   14 _fini
    
    Symbol table '.symtab' contains 50 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
         1: 0000000000400270     0 SECTION LOCAL  DEFAULT    1 
    ....
        27: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS t.c
        28: 0000000000600e14     0 NOTYPE  LOCAL  DEFAULT   18 __init_array_end
        29: 0000000000600e40     0 OBJECT  LOCAL  DEFAULT   21 _DYNAMIC
    

    Same thing for symbols in .symtab, but now foo has a symbol in the dynamic symbol section (and a bunch of other symbols appear there now too). This makes backtrace_symbols work - it now has enough information (in most cases) to map code addresses to function names.
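    To see the effect from inside a program, here is a minimal sketch built on glibc's backtrace() and backtrace_symbols(). The helper name print_trace and the MAX_FRAMES limit are illustrative choices, not something from the question.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 64   /* arbitrary cap chosen for this sketch */

/* Print the current call stack to stderr and return the frame count. */
int print_trace(void)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    /* backtrace_symbols() returns a single malloc'ed array of strings;
       only the array itself must be freed, not the individual strings. */
    char **names = backtrace_symbols(frames, n);
    if (names == NULL) {
        perror("backtrace_symbols");
        return n;
    }

    for (int i = 0; i < n; i++)
        fprintf(stderr, "#%d %s\n", i, names[i]);

    free(names);
    return n;
}
```

    Calling print_trace() from foo() in t.c and building once with and once without -rdynamic should show the difference: frames inside the executable typically carry a name plus offset only in the -rdynamic build, and fall back to raw addresses otherwise. Functions declared static never get a dynamic symbol, so they stay unnamed either way.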

    Strip that:

    $ strip --strip-all t
    $ readelf -s t
    
    Symbol table '.dynsym' contains 17 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
         1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
         2: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
         3: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _Jv_RegisterClasses
         4: 0000000000601018     0 NOTYPE  GLOBAL DEFAULT  ABS _edata
         5: 0000000000601008     0 NOTYPE  GLOBAL DEFAULT   24 __data_start
         6: 0000000000400734     6 FUNC    GLOBAL DEFAULT   13 foo
         7: 0000000000601028     0 NOTYPE  GLOBAL DEFAULT  ABS _end
         8: 0000000000601008     0 NOTYPE  WEAK   DEFAULT   24 data_start
         9: 0000000000400838     4 OBJECT  GLOBAL DEFAULT   15 _IO_stdin_used
        10: 0000000000400750   136 FUNC    GLOBAL DEFAULT   13 __libc_csu_init
        11: 0000000000400650     0 FUNC    GLOBAL DEFAULT   13 _start
        12: 0000000000601018     0 NOTYPE  GLOBAL DEFAULT  ABS __bss_start
        13: 000000000040073a    16 FUNC    GLOBAL DEFAULT   13 main
        14: 0000000000400618     0 FUNC    GLOBAL DEFAULT   11 _init
        15: 00000000004007e0     2 FUNC    GLOBAL DEFAULT   13 __libc_csu_fini
        16: 0000000000400828     0 FUNC    GLOBAL DEFAULT   14 _fini
    $ ./t
    $
    

    Now .symtab is gone, but the dynamic symbol table is still there, and the executable runs. So backtrace_symbols still works too.
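    The check that readelf performs here can be approximated in a few lines of C by scanning the section header table for SHT_SYMTAB and SHT_DYNSYM entries. This is a rough sketch, assuming a Linux/glibc system and an ELF file of the same word size as the program reading it; find_symbol_tables is a hypothetical helper name.

```c
#include <link.h>    /* ElfW() macro; also pulls in <elf.h> */
#include <stdio.h>
#include <string.h>

/* Set *has_symtab / *has_dynsym to 1 if the file has a section of that
   type. Returns 0 on success, -1 on I/O error or unsupported format. */
int find_symbol_tables(const char *path, int *has_symtab, int *has_dynsym)
{
    *has_symtab = 0;
    *has_dynsym = 0;

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    ElfW(Ehdr) eh;
    int native_class = (sizeof(void *) == 8) ? ELFCLASS64 : ELFCLASS32;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != native_class ||
        (eh.e_shnum != 0 && eh.e_shentsize != sizeof(ElfW(Shdr)))) {
        fclose(f);
        return -1;
    }

    int rc = 0;
    for (unsigned i = 0; i < eh.e_shnum; i++) {
        ElfW(Shdr) sh;
        long off = (long)(eh.e_shoff + (ElfW(Off))i * sizeof sh);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&sh, sizeof sh, 1, f) != 1) {
            rc = -1;
            break;
        }
        if (sh.sh_type == SHT_SYMTAB)
            *has_symtab = 1;   /* full symbol table, removed by strip */
        if (sh.sh_type == SHT_DYNSYM)
            *has_dynsym = 1;   /* dynamic symbol table, kept by strip */
    }

    fclose(f);
    return rc;
}
```

    Run against t before and after strip --strip-all, the has_symtab flag should go from 1 to 0 while has_dynsym stays at 1. Passing "/proc/self/exe" (Linux-specific) inspects the running program's own file.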

    Strip the dynamic symbol table:

    $ strip -R .dynsym t
    $ ./t
    ./t: relocation error: ./t: symbol , version GLIBC_2.2.5 not defined in file libc.so.6 with link time reference
    

    ... and you get a broken executable.

    An interesting read for what .symtab and .dynsym are used for is here: Inside ELF Symbol Tables. One of the things to note is that .symtab is not needed at runtime, so it is discarded by the loader. That section does not remain in the process's memory. .dynsym, on the other hand, is needed at runtime, so it is kept in the process image. So it is available for things like backtrace_symbols to gather information about the current process from within itself.
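    That the dynamic symbol table is reachable from inside the process can be illustrated with glibc's dl_iterate_phdr(). The sketch below walks each loaded object's PT_DYNAMIC segment and prints its DT_SYMTAB entry, which records where .dynsym lives. It only prints the value and never dereferences it. The names report_object and list_dynamic_symbol_tables are made up for this example.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Callback for dl_iterate_phdr(): invoked once per loaded object. */
static int report_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size;
    int *visited = data;
    (*visited)++;

    /* The main program is reported with an empty name. */
    const char *name = (info->dlpi_name && info->dlpi_name[0])
                           ? info->dlpi_name
                           : "(unnamed, usually the main program)";

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_DYNAMIC)
            continue;

        /* The dynamic section is mapped at load base + p_vaddr. */
        const ElfW(Dyn) *dyn =
            (const ElfW(Dyn) *)(info->dlpi_addr + ph->p_vaddr);
        for (; dyn->d_tag != DT_NULL; dyn++) {
            if (dyn->d_tag == DT_SYMTAB)
                printf("%s: DT_SYMTAB = %#lx\n", name,
                       (unsigned long)dyn->d_un.d_ptr);
        }
    }
    return 0;   /* zero tells dl_iterate_phdr() to keep going */
}

/* Returns the number of loaded objects that were visited. */
int list_dynamic_symbol_tables(void)
{
    int visited = 0;
    dl_iterate_phdr(report_object, &visited);
    return visited;
}
```

    There is no comparable runtime entry pointing at .symtab, which is consistent with the point above: an in-process tool has only the dynamic symbol table to work with.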

    So in short:

    • dynamic symbols are not stripped by strip since that would render the executable non-loadable
    • backtrace_symbols needs dynamic symbols to figure out what code belongs to which function
    • backtrace_symbols does not use debugging symbols

    Hence the behavior you noticed.


    For your specific questions:

    1. gdb is a debugger. It uses debug information in the executable and libraries to display relevant information. It is much more complex than backtrace_symbols, and inspects the actual files on your drive in addition to the live process. backtrace_symbols does not: it works entirely in-process, so it cannot access sections that are not loaded into the executable image. Debug sections are not loaded into the runtime image, so it can't use them.
    2. .dynsym is not a debugging section. It is a section used by the dynamic linker. .symtab isn't a debugging section either, but it can be used by debuggers that have access to the executable (and library) files. -rdynamic does not generate debug sections, only that extended dynamic symbol table. The executable growth from -rdynamic depends entirely on the number of symbols in that executable (and alignment/padding considerations). It should be considerably less than -g.
    3. Except for statically linked binaries, executables need external dependencies resolved at load time. Like linking printf and some application startup procedures from the C library. These external symbols must be indicated somewhere in the executable: this is what .dynsym is used for, and this is why the exe has a .dynsym even if you don't specify -rdynamic. When you do specify it, the linker adds other symbols that are not necessary for the process to work, but can be used by things like backtrace_symbols.
    4. backtrace_symbols will not resolve any function names if you statically link. Even if you specify -rdynamic, the .dynsym section will not be emitted to the executable. No symbol table gets loaded into the executable image, so backtrace_symbols cannot map code addresses to symbols.
