How does the linker know where the definition of an extern function is?

日久生厌 2021-01-12 07:29

I read a few posts and concluded that extern tells the compiler: "This function exists, but the code for it is somewhere else. Don't panic." But how does the linker know

4 answers
  •  伪装坚强ぢ
    2021-01-12 07:49

    When you compile a source file into a .o file in the ELF format, the .o file contains several things, such as:

    • a .text section containing the code;
    • .data, .rodata and .bss sections containing the global variables (initialized, read-only and zero-initialized, respectively);
    • a .symtab section containing the list of symbols (functions, global variables and others) defined in the .o file, with their locations in the file, as well as the symbols the .o file uses but does not define;
    • sections such as .rela.text, which are lists of relocations: the modifications that the link editor (and/or the dynamic linker) will have to make in order to link the different parts of your program together.

    On the caller side

    Let's compile a simple C file:

    extern void GrCircleDraw(int x);
    
    int foo()
    {
      GrCircleDraw(42);
      return 3;
    }
    
    int bla()
    {
      return 2;
    }
    

    with:

    gcc -o test.o test.c -c
    

    (I'm using the native compiler of my system but it will work quite the same when cross-compiling to ARM).

    You can look at the content of your .o file with:

    readelf -a test.o
    

    In the symbol table, you will find:

    Symbol table '.symtab' contains 10 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
    [...]
         8: 0000000000000000    21 FUNC    GLOBAL DEFAULT    1 foo
         9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND GrCircleDraw
        10: 0000000000000015    11 FUNC    GLOBAL DEFAULT    1 bla
    

    There is one symbol for our foo function and one for bla. The Value field gives their location within the .text section.

    There is also a symbol for GrCircleDraw, which the file uses but does not define: its Ndx is UND (undefined) because the function is not defined in this .o file and remains to be found elsewhere.
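
    To make the columns above concrete, here is a minimal sketch of how one symbol-table row is stored. It assumes the ELF-64 record layout (Elf64_Sym in <elf.h>) but mirrors it with local, hypothetical names (Sym64, sym_bind, sym_type, sym_is_undefined) so that it does not depend on a Linux header. The two sample records copy rows 8 and 9 of the table; their st_name values are placeholders, since the string-table offsets are not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the ELF-64 symbol record (Elf64_Sym in <elf.h>). */
typedef struct {
    uint32_t st_name;   /* offset of the name in the string table */
    uint8_t  st_info;   /* binding (high 4 bits) and type (low 4 bits) */
    uint8_t  st_other;  /* visibility */
    uint16_t st_shndx;  /* index of the defining section; 0 means undefined */
    uint64_t st_value;  /* in a .o file: offset within the defining section */
    uint64_t st_size;   /* size of the object or function in bytes */
} Sym64;

static_assert(sizeof(Sym64) == 24, "ELF-64 symbol records are 24 bytes");

/* Constant values as defined by the ELF specification. */
enum {
    SYM_SHN_UNDEF  = 0,  /* Ndx column shows UND */
    SYM_STB_GLOBAL = 1,  /* Bind column shows GLOBAL */
    SYM_STT_NOTYPE = 0,  /* Type column shows NOTYPE */
    SYM_STT_FUNC   = 2   /* Type column shows FUNC */
};

static unsigned sym_bind(const Sym64 *s) { return s->st_info >> 4; }
static unsigned sym_type(const Sym64 *s) { return s->st_info & 0xf; }
static bool sym_is_undefined(const Sym64 *s) { return s->st_shndx == SYM_SHN_UNDEF; }

/* Row 8: foo, defined in section 1 at offset 0, 21 bytes long.
 * st_name is a placeholder. */
static const Sym64 sym_foo = {
    0, (SYM_STB_GLOBAL << 4) | SYM_STT_FUNC, 0, 1, 0x0, 21
};

/* Row 9: GrCircleDraw, referenced but not defined in this file.
 * st_name is a placeholder. */
static const Sym64 sym_GrCircleDraw = {
    0, (SYM_STB_GLOBAL << 4) | SYM_STT_NOTYPE, 0, SYM_SHN_UNDEF, 0x0, 0
};
```

    With these records, sym_is_undefined(&sym_foo) is false and sym_is_undefined(&sym_GrCircleDraw) is true, which is the distinction the link editor relies on.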

    In the relocation table for the .text section (.rela.text) you find:

    Relocation section '.rela.text' at offset 0x260 contains 1 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    00000000000a  000900000002 R_X86_64_PC32     0000000000000000 GrCircleDraw - 4
    

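    This offset (0xa) lies within foo, which spans offsets 0 to 0x14: the link editor will patch the call instruction at this location so that it reaches GrCircleDraw. For an R_X86_64_PC32 relocation, the patched value is the address of GrCircleDraw plus the addend (-4), minus the address of the patched location, i.e. a 32-bit PC-relative displacement.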
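
    The patching step for this relocation type can be sketched as follows. This is only an illustration, not what any particular linker does internally: the function names (pc32_value, apply_pc32) are hypothetical, and the load addresses used in the usage note are made up. The formula S + A - P (symbol address, addend, address of the patched place) is the one the x86-64 ABI gives for R_X86_64_PC32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value stored by an R_X86_64_PC32 relocation: S + A - P, where
 * S = final address of the symbol, A = addend from the relocation entry,
 * P = final address of the 4 bytes being patched. */
static int32_t pc32_value(uint64_t sym_addr, int64_t addend, uint64_t place_addr)
{
    int64_t v = (int64_t)(sym_addr + (uint64_t)addend - place_addr);
    /* The displacement must fit in a signed 32-bit field. */
    assert(v >= INT32_MIN && v <= INT32_MAX);
    return (int32_t)v;
}

/* Patch 4 bytes of a .text image at `offset`. `text_addr` is the address
 * the image will have in the final program. */
static void apply_pc32(uint8_t *text, size_t text_len, uint64_t text_addr,
                       uint64_t offset, uint64_t sym_addr, int64_t addend)
{
    assert(offset + 4 <= text_len);
    uint32_t u = (uint32_t)pc32_value(sym_addr, addend, text_addr + offset);
    /* x86-64 is little-endian: least significant byte first. */
    for (int i = 0; i < 4; i++)
        text[offset + i] = (uint8_t)(u >> (8 * i));
}
```

    For example, if foo's .text were placed at 0x1000 and GrCircleDraw at 0x1020 (both addresses are assumptions), the relocation at offset 0xa with addend -4 would store 0x1020 - 4 - 0x100a = 0x12. That is the distance from the end of the 4-byte operand (0x100e) to the callee, which is what a PC-relative call needs.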

    On the callee side

    Now let's compile our own implementation of GrCircleDraw:

    void GrCircleDraw(int x)
    {
    
    }
    

    Let's look at its symbol table:

    Symbol table '.symtab' contains 9 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
    [...]
         8: 0000000000000000     9 FUNC    GLOBAL DEFAULT    1 GrCircleDraw
    

    It has an entry for GrCircleDraw defining its location within its .text section.

    Linking them together

    So when the link editor combines both files, it knows:

    • which functions are defined in which .o file, and their locations;
    • which locations in the caller's code it must update with the address of the callee.
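
    The two points above can be sketched as a toy model. Everything here is a simplification with hypothetical names (ToySym, ToyObj, toy_layout, toy_resolve): real link editors also handle alignment, multiple section types, duplicate definitions and libraries. The symbol offsets come from the two symbol tables shown earlier; the file name circle.o, the .text size of 0x20 for test.o (foo plus bla, assuming no padding) and the base address are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One symbol-table row, reduced to what this sketch needs. */
typedef struct {
    const char *name;
    uint64_t    value;    /* offset within the object's .text */
    int         defined;  /* 0 = undefined (Ndx UND) */
} ToySym;

/* One object file, reduced to its .text size and symbols. */
typedef struct {
    const char   *file;
    uint64_t      text_size;
    uint64_t      text_base;  /* final address, filled in by toy_layout */
    const ToySym *syms;
    size_t        nsyms;
} ToyObj;

static const ToySym test_syms[] = {
    { "foo",          0x00, 1 },
    { "GrCircleDraw", 0x00, 0 },  /* used here, defined elsewhere */
    { "bla",          0x15, 1 },
};
static const ToySym circle_syms[] = {
    { "GrCircleDraw", 0x00, 1 },
};

static ToyObj toy_objs[] = {
    { "test.o",   0x20, 0, test_syms,   3 },
    { "circle.o", 0x09, 0, circle_syms, 1 },  /* hypothetical file name */
};

/* Step 1: place each object's .text right after the previous one. */
static void toy_layout(ToyObj *objs, size_t n, uint64_t base)
{
    for (size_t i = 0; i < n; i++) {
        objs[i].text_base = base;
        base += objs[i].text_size;
    }
}

/* Step 2: find the final address of `name` by scanning the defined
 * symbols of every object. Returns 0 on success, -1 if nothing defines it
 * (the situation a real linker reports as an undefined reference). */
static int toy_resolve(const ToyObj *objs, size_t n, const char *name,
                       uint64_t *addr)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < objs[i].nsyms; j++) {
            const ToySym *s = &objs[i].syms[j];
            if (s->defined && strcmp(s->name, name) == 0) {
                *addr = objs[i].text_base + s->value;
                return 0;
            }
        }
    }
    return -1;
}
```

    After toy_layout(toy_objs, 2, 0x1000), resolving GrCircleDraw yields 0x1020: the undefined entry in test.o is skipped and the definition in circle.o is used. That address is what the relocation in foo would then be patched against.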
