How does the linker find the main function?

Submitted by 空扰寡人 on 2021-02-07 06:25:06

Question


How does the linker find the main function in an x86-64 ELF-format executable?


Answer 1:


At a very high level, the linker assigns an address to the block of code identified by the symbol main, just as it does for every other symbol in your object files.

Strictly speaking, it doesn't assign a real address; it assigns an address relative to some base, which the loader translates to a real address when the program is executed.
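As a rough illustration of that relative-versus-real distinction, the sketch below subtracts the start of the loaded image from a runtime address. It assumes a GNU ld style default linker script, which provides the symbol __executable_start; other linkers may not define it. The helper name offset_in_image is hypothetical.

```c
#include <stdint.h>

/* Assumption: provided by GNU ld's default linker script on ELF targets.
   It marks the start of the executable image in memory. */
extern char __executable_start[];

/* Hypothetical helper: distance of a runtime address from the start of
   the loaded image. */
uintptr_t offset_in_image(uintptr_t runtime_addr)
{
    return runtime_addr - (uintptr_t)__executable_start;
}
```

Calling offset_in_image((uintptr_t)main) from your own program and printing the result in hex gives the distance of main from the image start. For a position-independent executable this should correspond to the address a disassembler reports for main, while the runtime address itself can change between runs.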

The actual entry point is most likely not main but some symbol in the C runtime startup code (crt) that calls main. By default, ld looks for the symbol start (_start on ELF targets such as x86-64 Linux) unless you specify something different.

The linked code ends up in the .text section of the executable and could look something like this (very simplified):

Address | Code
1000      someFunction
...
2000      start
2001        call 3000
...
3000      main
...

When the linker writes the ELF header it would specify the entry point as address 2000.
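To see the entry point the linker recorded, you can read it straight out of the ELF header. The following is a minimal sketch, assuming a Linux system with <elf.h>, a 64-bit ELF file, and a host whose byte order matches the file's. The helper name read_elf64_entry is hypothetical.

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the e_entry field of a 64-bit ELF file.
   Returns 0 on success and -1 on any failure. */
int read_elf64_entry(const char *path, uint64_t *entry)
{
    Elf64_Ehdr hdr;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* The ELF header sits at the very beginning of the file. */
    size_t n = fread(&hdr, 1, sizeof hdr, f);
    fclose(f);
    if (n != sizeof hdr)
        return -1;

    /* Check the magic bytes and make sure this is a 64-bit object. */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    *entry = hdr.e_entry;
    return 0;
}
```

The value stored in *entry should match the "Entry point address" line that readelf -h prints for the same file.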

You can get the relative address of main by dumping the contents of the executable with a tool such as objdump. To get the actual address at runtime, take the address of the function in code, e.g. funcptr ptr = main;, where funcptr is defined as a pointer to a function with the signature of main.

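#include <stdio.h>

typedef int (*funcptr)(int argc, char* argv[]);

int main(int argc, char* argv[])
{
    funcptr ptr = main;
    /* %p expects a void *; casting a function pointer this way is not
       strictly portable ISO C but works on common platforms. */
    printf("%p\n", (void *)ptr);
    return 0;
}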

The address of main will be resolved correctly regardless of whether symbols have been stripped, since the linker resolves the symbol main to its relative address at link time, before any stripping takes place.

Use objdump like this:

$ objdump -f funcptr.exe 

funcptr.exe:     file format pei-i386
architecture: i386, flags 0x0000013a:
EXEC_P, HAS_DEBUG, HAS_SYMS, HAS_LOCALS, D_PAGED
start address 0x00401000

Looking for main specifically, on my machine I get this:

$ objdump -D funcptr.exe | grep main
  40102c:       e8 af 01 00 00          call   4011e0 <_cygwin_premain0>
  401048:       e8 a3 01 00 00          call   4011f0 <_cygwin_premain1>
  401064:       e8 97 01 00 00          call   401200 <_cygwin_premain2>
  401080:       e8 8b 01 00 00          call   401210 <_cygwin_premain3>
00401170 <_main>:
  401179:       e8 a2 00 00 00          call   401220 <___main>
004011e0 <_cygwin_premain0>:
004011f0 <_cygwin_premain1>:
00401200 <_cygwin_premain2>:
00401210 <_cygwin_premain3>:
00401220 <___main>:

Note that I am on Windows using Cygwin, so the output above is for a PE executable rather than an ELF one and your results will differ slightly. It looks like main lives at 00401170 for me.




Answer 2:


With Binutils, the entry point is determined by either:

  • -e CLI option
  • linker script

You can view the default linker script with:

ld --verbose

Mine contains:

ENTRY(_start)

Then, at link time, glibc-provided object files such as crt1.o, which contain the _start symbol, are passed to the linker together with your main.o.

Those object files do some setup for you like argv, and then call your main function.
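Conceptually, that startup code behaves like the sketch below. This is only a model written in plain C, not glibc's actual implementation: the names startup_sketch and sample_main are hypothetical, and real startup code also initialises the C library and passes the return value of main to exit.

```c
#include <stddef.h>

/* Signature of a conventional main function. */
typedef int (*main_fn)(int argc, char *argv[]);

/* Hypothetical stand-in for a user's main: returns its argument count. */
int sample_main(int argc, char *argv[])
{
    (void)argv;
    return argc;
}

/* Hypothetical model of the startup code: derive argc from a
   NULL-terminated argument vector, then hand control to the user's main. */
int startup_sketch(main_fn user_main, char *arg_vector[])
{
    int argc = 0;
    while (arg_vector[argc] != NULL)
        argc++;

    /* Real startup code would pass this result to exit(). */
    return user_main(argc, arg_vector);
}
```

The point of the model is that main is an ordinary function called by other code; the entry point recorded in the executable belongs to the startup code, not to main.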

You can see those extra object files being sneaked in with gcc -v.

This is documented at: https://sourceware.org/binutils/docs/ld/Entry-Point.html#Entry-Point

The first instruction to execute in a program is called the entry point. You can use the ENTRY linker script command to set the entry point. The argument is a symbol name:

 ENTRY(symbol)

There are several ways to set the entry point. The linker will set the entry point by trying each of the following methods in order, and stopping when one of them succeeds:

  • the `-e' entry command-line option;
  • the ENTRY(symbol) command in a linker script;
  • the value of a target specific symbol, if it is defined; For many targets this is start, but PE and BeOS based systems for example check a list of possible entry symbols, matching the first one found.
  • the address of the first byte of the `.text' section, if present;
  • The address 0.
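The fallback order in that list can be modelled as a simple chain of checks. The sketch below is only an illustration of the documented order, not ld's source code; the struct, enum, and function names are all hypothetical.

```c
#include <stddef.h>

/* Which of the documented methods ends up supplying the entry point. */
enum entry_source {
    FROM_E_OPTION,       /* -e on the command line */
    FROM_ENTRY_COMMAND,  /* ENTRY(symbol) in a linker script */
    FROM_TARGET_SYMBOL,  /* target-specific default symbol */
    FROM_TEXT_START,     /* first byte of .text */
    FROM_ADDRESS_ZERO    /* last resort: address 0 */
};

/* Hypothetical summary of what the linker knows at this point. */
struct link_inputs {
    const char *e_option;      /* symbol given with -e, or NULL */
    const char *script_entry;  /* symbol given in ENTRY(...), or NULL */
    int target_symbol_defined; /* nonzero if the default symbol is defined */
    int has_text_section;      /* nonzero if a .text section is present */
};

/* Try each method in the documented order and stop at the first match. */
enum entry_source choose_entry_source(const struct link_inputs *in)
{
    if (in->e_option != NULL)
        return FROM_E_OPTION;
    if (in->script_entry != NULL)
        return FROM_ENTRY_COMMAND;
    if (in->target_symbol_defined)
        return FROM_TARGET_SYMBOL;
    if (in->has_text_section)
        return FROM_TEXT_START;
    return FROM_ADDRESS_ZERO;
}
```

With a typical gcc link on Linux, no -e option is given and the default linker script contains ENTRY(_start), so in this model the second check is the one that succeeds.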

See also: is there a GCC compiler/linker option to change the name of main?



Source: https://stackoverflow.com/questions/17708649/how-does-the-linker-find-the-main-function
