where/how does Apples GCC store DWARF inside an executable

和自甴很熟 提交于 2019-11-27 19:11:19
Jason Molenda

On Mac OS X there was a decision to have the linker (ld) not process all of the debug information when you link your program. The debug information is often 10x the size of the program executable so having the linker process all of the debug info and include it in the executable binary was a serious detriment to link times. For iterative development - compile, link, compile, link, debug, compile link - this was a real hit.

Instead the compiler generates the DWARF debug information in the .s files, the assembler outputs it in the .o files, and the linker includes a "debug map" in the executable binary which tells debug info users where all of the symbols were relocated during the link.

A consumer (doing .o-file debugging) loads the debug map from the executable and processes all of the DWARF in the .o files as-needed, remapping the symbols as per the debug map's instructions.

dsymutil can be thought of as a debug info linker. It does this same process -- read the debug map, load the DWARF from the .o files, relocate all the addresses -- and then outputs a single binary of all the DWARF at their final, linked addresses. This is the dSYM bundle.

Once you have a dSYM bundle, you've got plain old standard DWARF that any dwarf reading tool (which can deal with Mach-O binary files) can process.

There is an additional refinement that makes all of this work, the UUIDs included in Mach-O binaries. Every time the linker creates a binary, it emits a 128-bit UUID in the LC_UUID load command (v. otool -hlv or dwarfdump --uuid). This uniquely identifies that binary file. When dsymutil creates the dSYM, it includes that UUID. The debuggers will only associate a dSYM and an executable if they have matching UUIDs -- no dodgy file mod timestamps or anything like that.

We can also use the UUIDs to locate the dSYMs for binaries. They show up in crash reports, we include a Spotlight importer that you can use to search for them, e.g. mdfind "com_apple_xcode_dsym_uuids == E21A4165-29D5-35DC-D08D-368476F85EE1" if the dSYM is located in a Spotlight indexed location. You can even have a repository of dSYMs for your company and a program that can retrieve the correct dSYM given a UUID - maybe a little mysql database or something like that - so you run the debugger on a random executable and you instantly have all the debug info for that executable. There are some pretty neat things you can do with the UUIDs.

But anyway, to answer your origin question: The unstripped binary has the debug map, the .o files have the DWARF, and when dsymutil is run these are combined to create the dSYM bundle.

If you want to see the debug map entries, do nm -pa executable and they're all there. They are in the form of the old stabs nlist records - the linker already knew how to process stabs so it was easiest to use them - but you'll see how it works without much trouble, maybe refer to some stabs documentation if you're uncertain.

It seems it actually doesn't.

I traced dsymutil and it reads all the *.o files. objdump -h also lists all the debug info in them.

So it seems that those info isn't copied over to the binary.


Some related comments about this can also be found here.

It seems there are two ways for OSX to place debug information:

  1. In the .o object files used for compilation. The binary stores a reference to these files (by absolute path).

  2. In a separate bundle (directory) called .dSYM

If I compile with Apple's Clang using g++ -g main.cpp -o foo I get the bundle called foo.dSYM. However if I use CMake I get the debug info in the object files. I guess because it does a separate gcc -c main.cpp -o main.o step?

Anyway I found this command very useful for case 1:

$ dsymutil -dump-debug-map main
---
triple:          'x86_64-apple-darwin'
binary-path:     main
objects:         
  - filename:        /Users/tim/foo/build/CMakeFiles/main.dir/main.cpp.o
    timestamp:       1485951213
    symbols:         
      - { sym: __ZNSt3__111char_traitsIcE11eq_int_typeEii, objAddr: 0x0000000000000D50, binAddr: 0x0000000100001C90, size: 0x00000020 }
      - { sym: __ZNSt3__111char_traitsIcE6lengthEPKc, objAddr: 0x0000000000000660, binAddr: 0x00000001000015A0, size: 0x00000020 }
      - { sym: GCC_except_table3, objAddr: 0x0000000000000DBC, binAddr: 0x0000000100001E2C, size: 0x00000000 }
      - { sym: _main, objAddr: 0x0000000000000000, binAddr: 0x0000000100000F40, size: 0x00000090 }
      - { sym: __ZNSt3__124__put_character_sequenceIcNS_11char_traitsIcEEEERNS_13basic_ostreamIT_T0_EES7_PKS4_m, objAddr: 0x00000000000001F0, binAddr: 0x0000000100001130, size: 0x00000470 }
      - { sym: ___clang_call_terminate, objAddr: 0x0000000000000D40, binAddr: 0x0000000100001C80, size: 0x00000010 }
      - { sym: GCC_except_table5, objAddr: 0x0000000000000E6C, binAddr: 0x0000000100001EDC, size: 0x00000000 }
      - { sym: __ZNSt3__116__pad_and_outputIcNS_11char_traitsIcEEEENS_19ostreambuf_iteratorIT_T0_EES6_PKS4_S8_S8_RNS_8ios_baseES4_, objAddr: 0x0000000000000680, binAddr: 0x00000001000015C0, size: 0x000006C0 }
      - { sym: __ZNSt3__14endlIcNS_11char_traitsIcEEEERNS_13basic_ostreamIT_T0_EES7_, objAddr: 0x00000000000000E0, binAddr: 0x0000000100001020, size: 0x00000110 }
      - { sym: GCC_except_table2, objAddr: 0x0000000000000D7C, binAddr: 0x0000000100001DEC, size: 0x00000000 }
      - { sym: __ZNSt3__1lsINS_11char_traitsIcEEEERNS_13basic_ostreamIcT_EES6_PKc, objAddr: 0x0000000000000090, binAddr: 0x0000000100000FD0, size: 0x00000050 }
      - { sym: __ZNSt3__111char_traitsIcE3eofEv, objAddr: 0x0000000000000D70, binAddr: 0x0000000100001CB0, size: 0x0000000B }
...

Apple stores debugging information in separate files named *.dSYM. You can run dwarfdump on those files and see the DWARF Debug Info Entries.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!