What happens during a “relocation has invalid symbol index” error?

孤独总比滥情好 2021-02-13 18:57

Here is a test reproducing the problem:

$ echo "void whatever() {}" > prog.c
$ gcc prog.c

This produces the following error on GCC 4.8.4:

3 Answers
  • 2021-02-13 19:17

    C Program Features (Unix-like)

    • every source file is compiled separately into an ELF object file
    • a C program can reference external variables and functions, which are resolved later, at link time
    • main is not the real start of the program, as you might have thought: the C library ships a startup object (crt1.o) whose _start routine calls main and cleans up after main returns
    • from the above it follows that even a program as simple as the one in the question has to be linked

    ELF Format

    ELF has two header tables:

    • section header table: used by the linker to combine multiple ELF files into one process image
    • program header table: used by the loader to load the process image

    Here we only focus on section header structure:

        mapping<var_name, offset, size...>
        // and special cases
        mapping<external_var_name, offset, size...>
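
The mapping above can be pictured as a small table of named entries. The following is a minimal sketch in C, not the real ELF layout (the actual entry types are Elf32_Sym and Elf64_Sym in <elf.h>). The names sym_entry and find_symbol are hypothetical and only illustrate that an external reference is an entry without a definition, and that a name can be absent from the table altogether.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol-table entry (not the real Elf32_Sym). */
struct sym_entry {
    const char *name;   /* symbol name, e.g. "whatever" */
    unsigned offset;    /* offset inside the defining section */
    unsigned size;      /* size of the function or object in bytes */
    int defined;        /* 0 for an external reference defined elsewhere */
};

/* Return the index of `name` in `table`, or -1 if it is not present. */
static int find_symbol(const struct sym_entry *table, size_t count,
                       const char *name) {
    assert(table != NULL && name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

For a prog.c that only defines whatever, a lookup of "main" in such a table returns -1, which is the situation the linker ends up reporting.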
    

    Every source file is compiled separately, so each object file lays out its code and data starting from the same base address, and the address ranges of different object files overlap. (Early 32-bit Linux executables were also all linked at the same fixed virtual address, 0x08048000, which many attacks exploited; later systems add a random delta to the load address to mitigate this.) This is why address relocation is needed.

    Relocation

    The relocation info (offset, value, etc.) is stored in the .rel.* sections:

        Relocation section '.rel.text' at offset 0x7a4 contains 2 entries:
         Offset     Info    Type            Sym.Value  Sym. Name
        0000000d  00000e02 R_386_PC32        00000000   main
        00000015  00000f02 R_386_PC32        00000000   exit
    
        Relocation section '.rel.debug_info' at offset 0x7b4 contains 43 entries:
         Offset     Info    Type            Sym.Value  Sym. Name
        00000006  00000601 R_386_32          00000000   .debug_abbrev
        0000000c  00000901 R_386_32          00000000   .debug_str
    

    When the linker wants to fill in the address of main during relocation, it cannot find that symbol in your compiled ELF file, so it reports an error and stops linking.
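
The Info column in the readelf output above packs two fields. In 32-bit ELF the upper 24 bits hold the symbol-table index and the low 8 bits hold the relocation type (this is what the ELF32_R_SYM and ELF32_R_TYPE macros in <elf.h> compute). Below is a small sketch with hypothetical helper names, used instead of the system macros so that it compiles anywhere. The reloc_value helper follows the i386 ABI formulas S + A for R_386_32 and S + A - P for R_386_PC32.

```c
#include <assert.h>
#include <stdint.h>

/* Relocation type numbers from the i386 ABI. */
enum { RELOC_386_32 = 1, RELOC_386_PC32 = 2 };

/* Upper 24 bits of r_info: index into the symbol table. */
static uint32_t reloc_sym_index(uint32_t info) { return info >> 8; }

/* Low 8 bits of r_info: relocation type. */
static uint32_t reloc_type(uint32_t info) { return info & 0xffu; }

/* Value the linker stores at the patched location.
   S = symbol address, A = addend, P = address of the patched location. */
static uint32_t reloc_value(uint32_t type, uint32_t S, uint32_t A, uint32_t P) {
    assert(type == RELOC_386_32 || type == RELOC_386_PC32);
    return type == RELOC_386_PC32 ? S + A - P : S + A;
}
```

For the first .rel.text entry, Info 00000e02 decodes to symbol index 14 and type 2 (R_386_PC32). If that symbol (main) has no definition in any input file, S is unknown and the value cannot be computed.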

    Example

    Here is a simplified version of an OS implementation; start.c corresponds to the source code of crt1.o:

        #include <stdlib.h>  // declares exit()

        int entry(char *);   // corresponds to main

        void _start(char *args) {
            // run the program, then terminate with its return value
            exit(entry(args));
        }
    

  • 2021-02-13 19:26

    We can break this down into two parts:

    Undefined reference to `main'

    /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crt1.o: In function `_start':
    (.text+0x20): undefined reference to `main'
    collect2: error: ld returned 1 exit status
    

    This is simply because the C runtime startup object (crt1.o) is trying to call your (missing) main() function. There is a good overview of the various C Runtime files here / here.

    Relocation X has invalid symbol index Y

    Note: This has been a bit of a mission (learning opportunity) for me. I've taken a few days to research and understand it in the limited free time that I have. It's something that I've wondered about for a long time but never looked into... Hopefully my understanding is correct (though clearly not complete; I'll update it if I can).

    This is a little more complex, and hidden away.

    Just by reading the message, we can glean that it's related to debug information (the .debug_info and .debug_line sections of crt1.o's debug file are mentioned). Note the /usr/lib/debug/ path, which contains only debug information; the other crt1.o is a "stripped" file...

    The format string is found in the binutils project, specifically in bfd/elfcode.h. BFD stands for Binary File Descriptor, the GNU library for handling object files across a number of system architectures.

    BFD gives the binutils tools a common intermediate representation of binary files: the linker (ld), which GCC invokes for the final step, works through BFD before it finally writes an a.out, ELF, or other binary.

    Looking into the manual, we can find some interesting snippets of knowledge:

    [...] with each entry in the hash table the a.out linker keeps the index the symbol has in the final output file (this index number is used so that when doing a relocateable link the symbol index used in the output file can be quickly filled in when copying over a reloc). [source]

    The standard records contain only an address, a symbol index, and a type field. [source]

    This means that these errors are issued due to relocations that are related to specific (missing?) 'symbols'. A 'symbol' in this context is any named 'thing' - e.g: functions and variables.

    As these 'invalid symbols' appear to be resolved by simply declaring main(), I would guess that some (all?) of these symbol indexes are derived from main(), its debug information, and/or relations.

    I can't tell you what is supposed to be at the symbol indexes mentioned (2, 11, 12, 13, 21), but it is interesting that my tests yielded the same list of symbol indexes.
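
One way to read the message is as a bounds check: each relocation names its symbol by index, and an index that does not fall inside the symbol table the linker has for that file cannot be resolved. The snippet below is an illustration of that idea only, not BFD's code. The reloc_record type, the helper names, and the symbol counts in the usage note are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record with the three fields named in the BFD manual. */
struct reloc_record {
    unsigned long address;    /* where to patch */
    unsigned long sym_index;  /* which symbol the patch refers to */
    unsigned type;            /* how to compute the patched value */
};

/* A symbol index is usable only if it falls inside the symbol table. */
static int sym_index_is_valid(unsigned long sym_index, unsigned long sym_count) {
    return sym_index < sym_count;
}

/* Count the relocations whose symbol index is out of range. */
static size_t count_invalid(const struct reloc_record *relocs, size_t n,
                            unsigned long sym_count) {
    assert(relocs != NULL || n == 0);
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (!sym_index_is_valid(relocs[i].sym_index, sym_count)) {
            bad++;
        }
    }
    return bad;
}
```

With a hypothetical table of only 2 symbols, every index listed above (2, 11, 12, 13, 21) would be rejected; with 22 or more symbols, none would.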

    Running ld with crt1.o alone gives us similar output:

    $ ld /usr/lib/x86_64-linux-gnu/crt1.o
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 0 has invalid symbol index 11
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 1 has invalid symbol index 12
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 2 has invalid symbol index 2
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 3 has invalid symbol index 2
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 4 has invalid symbol index 11
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 5 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 6 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 7 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 8 has invalid symbol index 12
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 9 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 10 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 11 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 12 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 13 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 14 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 15 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 16 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 17 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 18 has invalid symbol index 13
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 19 has invalid symbol index 21
    ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_line): relocation 0 has invalid symbol index 2
    /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crt1.o: In function `_start':
    (.text+0x12): undefined reference to `__libc_csu_fini'
    /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crt1.o: In function `_start':
    (.text+0x19): undefined reference to `__libc_csu_init'
    /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crt1.o: In function `_start':
    (.text+0x20): undefined reference to `main'
    /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crt1.o: In function `_start':
    (.text+0x25): undefined reference to `__libc_start_main'
    
  • 2021-02-13 19:41

    Instead of compiling the code in one step, go through the stages of compilation one at a time to find where the error arises (as far as I know, such errors occur during linking). The following gcc options will be helpful:

    • -E Preprocess only; do not compile, assemble or link
    • -S Compile only; do not assemble or link
    • -c Compile and assemble, but do not link

    Now:

    gcc -E prog.c
    gcc -S prog.c
    gcc -c prog.c
    

    With the code you have mentioned, all these steps work fine with gcc 4.8.4. The failure comes at link time: when you run gcc prog.c, the startup file crt1.o is linked in and its _start routine refers to main, but prog.c has no main function. So we need to pass the -nostartfiles switch, which leaves the standard startup files out of the link. Hence, you can compile prog.c as:

    gcc prog.c -lc -nostartfiles
    

    This produces the warning:

    /usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 00000000004002a3

    This is because of the startup sequence: the process normally starts at the _start function, and _start calls main. With -nostartfiles, the file that defines _start (crt1.o) is no longer linked in, so the linker cannot find its default entry symbol and falls back to the address shown. Please note that this is just a warning. To avoid it, name the entry point explicitly:

    gcc prog.c -lc --entry whatever -nostartfiles
    

    With this command, we are instructing gcc to compile prog.c and link it against libc.so with the function whatever as the entry point, since this code contains no main function.
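
One caveat when a plain C function is used as the entry point on Linux: nothing called it, so there is no return address on the stack and the function must end the process itself rather than return. The sketch below shows what a prog.c written for this command line might look like. The helper run is hypothetical, and the code skips everything else the normal startup files do (such as setting up the stack the way compiled code expects), so treat it as an illustration rather than a recommended pattern.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: the work the program does; returns its exit status. */
static int run(void) {
    return EXIT_SUCCESS;
}

/* Entry point named on the command line with --entry whatever.
   It must not return, because there is no caller to return to. */
void whatever(void) {
    int status = run();
    assert(status == EXIT_SUCCESS || status == EXIT_FAILURE);
    exit(status);
}
```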

    This is the context with gcc 4.8.4, which I've compiled on.

    Coming to the case of gcc 6.2.0, I think all this linking is taken care of by the compiler itself. Hence, you can simply use the command shown below. (Note that -c stops before the link step, so this produces only the object file prog.o.)

    gcc -c prog.c -nostartfiles
    

    If it produces any other errors or warnings, you can use the switches mentioned above.

    Also, note that the crt files (crt1.o, crti.o, crtn.o and so on; the exact set depends on the platform) are startup objects that are linked into the executable: they provide _start and the code that runs before and after main. The linking errors shown as

    /usr/bin/ld: /usr/lib/debug/usr/lib/x86_64-linux-gnu/crt1.o(.debug_info): relocation 0 has invalid symbol index 11

    do not by themselves provide enough information to rectify the issue, since the linker reports where it failed rather than why.

    The --entry command above produces a complete executable file. Note that code without a main function is typical when we are working on libraries or modules within a project.

    All the data provided is done with step-by-step analysis. You can recreate all the steps mentioned. Hope this cleared your doubt. Good day!
