How does a C program get started?
Probably the most useful resource for this question is http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html, the best explanation I have come across so far.
Ultimately, it is the operating system. Usually there is some intermediate code between the real entry point and the main function; it is inserted by the compiler and linker.
Some details (related to Windows): the PE file contains a header called IMAGE_OPTIONAL_HEADER, which has the field AddressOfEntryPoint. That field holds the address (relative to the image base) of the first code byte in the file that will be executed.
The operating system calls the main() function. Actually, it usually calls something else with a strange name like _start. The C compiler links a standard library into every application; that library provides this operating-system-defined entry point and then calls main().
Edit: Obviously that was not detailed and correct enough for some people.
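The Executable and Linkable Format (ELF), which many Unix operating systems use, defines an entry point address. That is where the program begins to run after the OS finishes its exec() call. On a Linux system built with GCC and glibc, the entry point is typically _start; the startup path then runs the initialization function _init before main() is called.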
From objdump -d:
Disassembly of section .init:
08049f08 <_init>:
8049f08: 55 push %ebp
8049f09: 89 e5 mov %esp,%ebp
8049f0b: 83 ec 08 sub $0x8,%esp
8049f0e: e8 a1 05 00 00 call 804a4b4 <call_gmon_start>
8049f13: e8 f8 05 00 00 call 804a510 <frame_dummy>
8049f18: e8 d3 50 00 00 call 804eff0 <__do_global_ctors_aux>
8049f1d: c9 leave
8049f1e: c3 ret
From readelf -d:
0x00000001 (NEEDED) Shared library: [libstdc++.so.6]
0x00000001 (NEEDED) Shared library: [libm.so.6]
0x00000001 (NEEDED) Shared library: [libgcc_s.so.1]
0x00000001 (NEEDED) Shared library: [libpthread.so.0]
0x00000001 (NEEDED) Shared library: [libc.so.6]
0x0000000c (INIT) 0x8049f08
0x0000000d (FINI) 0x804f018
0x00000004 (HASH) 0x8048168
0x00000005 (STRTAB) 0x8048d8c
0x00000006 (SYMTAB) 0x804867c
0x0000000a (STRSZ) 3313 (bytes)
0x0000000b (SYMENT) 16 (bytes)
0x00000015 (DEBUG) 0x0
0x00000003 (PLTGOT) 0x8059114
0x00000002 (PLTRELSZ) 688 (bytes)
0x00000014 (PLTREL) REL
0x00000017 (JMPREL) 0x8049c58
0x00000011 (REL) 0x8049be0
0x00000012 (RELSZ) 120 (bytes)
0x00000013 (RELENT) 8 (bytes)
0x6ffffffe (VERNEED) 0x8049b60
0x6fffffff (VERNEEDNUM) 3
0x6ffffff0 (VERSYM) 0x8049a7e
0x00000000 (NULL) 0x0
You can see that INIT is equal to the address of _init.
The code for frame_dummy and __do_global_ctors_aux is in a set of files named crtbegin.o and crtend.o (and variants of those names). These are part of GCC. Together with the C library's own startup code, that code does various things a program needs before main() runs, such as running global constructors, setting up stdin and stdout, and initializing global and static variables that need run-time initialization.
The following article describes quite well what it does on Linux (taken from an answer below with fewer votes): http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html
I believe someone else's answer already described what Windows does.
The operating system calls main. There will be an address in the relocatable executable that points at the location of main (See the Unix ABI for more information).
But, who calls the operating system?
On the "RESET" signal (which is also asserted at power-on), the central processing unit begins fetching instructions from some ROM at a fixed address (say, 0xffff).
Typically there will be some sort of jump instruction out to the BIOS, which gets the memory chips configured, the basic hard drive drivers loaded, etc, etc. Then the Boot Sector of the hard drive is read, and the next bootloader is started, which loads the file containing the basic information of how to read, say, an NTFS partition and how to read the kernel file itself. The kernel environment will be set up, the kernel loaded, and then - and then! - the kernel will be jumped to for execution.
After all that hard work has been done, the kernel can then proceed to load our software.
http://coding.derkeiler.com/Archive/C_CPP/comp.lang.c/2008-04/msg04617.html
Note that in addition to the answers already posted, it is also possible for you to call main yourself. Generally this is a bad idea reserved for obfuscated code.