I am learning about the compilation process, and I know that linking is mainly used to link a binary file which contains a 'main' function with other binary files that contain ot
You don't have a bootstrap; you are in a chicken-and-egg problem.
The code for that function is there, but it rests on assumptions. First and foremost, you need a stack. Depending on the architecture, your return address may be on that stack, for example, and the return value may be on it too. The C language doesn't provide for that directly: there is always at least a little bit of assembly, or some other language, required in order to "bootstrap" your function. For example, for ARM with the GNU tools:
bs.s
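.globl _start
_start:
    mov sp,#0x8000    @ give the C code a stack before calling it
    bl main           @ call main; the return address goes in lr
    b .               @ if main returns, loop here forever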
so.c
int main ( void )
{
return(0);
}
For ARM the function is complete: the instructions don't need to be modified by the linker. But no address space has been defined: the zero address shown for this object was either specified or simply assumed by the disassembler. It is an object, not a loadable binary.
00000000 <main>:
0: e3a00000 mov r0, #0
4: e12fff1e bx lr
Now if we add the bootstrap and link to some address, we get a real, executable program:
00008000 <_start>:
8000: e3a0d902 mov sp, #32768 ; 0x8000
8004: eb000000 bl 800c <main>
8008: eafffffe b 8008 <_start+0x8>
0000800c <main>:
800c: e3a00000 mov r0, #0
8010: e12fff1e bx lr
That doesn't mean one couldn't craft an operating system or an environment where you could load functions this way, using the compiler's object output. But that is the reason for the word "chain" in "toolchain". The compiler makes assembly language; the assembler assembles it; then, combined with other necessary objects (bootstrap plus compiler libraries plus C libraries, etc.), the linker defines the address spaces for everything and modifies the code and data as needed to resolve externals. It is a sequence, or chain, of events to get the final result.
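To make "resolve externals" concrete, here is a minimal sketch in C. The names add_one and call_helper, and the file names caller.c and helper.c, are made up for illustration. The two halves would normally live in separate files; they are shown together only so the listing also compiles as a single file.

```c
/* caller.c (hypothetical file): uses a function it does not define. */
int add_one(int x);   /* declaration only; compiled alone, caller.o
                         holds an unresolved reference to add_one */

int call_helper(void)
{
    /* With separate files, the target of this call is filled in
       by the linker, not by the compiler. */
    return add_one(41);
}

/* helper.c (hypothetical file): supplies the definition that the
   linker matches to the unresolved reference above. */
int add_one(int x)
{
    return x + 1;
}
```

Compiled on its own, caller.c gives an object in which add_one is still undefined. Only the link step, given the object for helper.c as well, produces something in which call_helper() can actually run and return 42.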
Even the most basic functions, like exit, aren't directly in the language and need to be linked.
http://en.cppreference.com/w/c/program/exit
Read Levine's Linkers & Loaders.
Read about ELF.
Try compiling with gcc -v (you'll see which programs are actually used: cc1 to compile C code into assembler source, the assembler as to turn that into an object file, and ld and collect2 to link). Look also at the generated assembler file with gcc -S -fverbose-asm -O. Notice that gcc knows about (and compiles specially) the main function. And the starting point of your executable is provided by some crt0, etc. (it is not main but some _start routine coded in assembler which calls your main ...).
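As a rough model of what such a _start routine does, here is a sketch in plain C rather than assembler. Everything named here (model_start, app_main, the "prog" argument) is hypothetical. A real crt0 is target-specific, also sets up the stack and runs library initialisation, and typically finishes by passing the status to exit instead of returning.

```c
#include <stddef.h>   /* NULL */

/* Stand-in for the user's main (hypothetical name, chosen so it
   cannot clash with a real main). */
int app_main(int argc, char **argv)
{
    (void)argv;
    return argc - 1;   /* 0 when only the program name is passed */
}

/* Hypothetical model of the startup routine: prepare the arguments,
   call the entry function, and hand back its status. */
int model_start(int (*entry)(int, char **))
{
    static char name[] = "prog";   /* real startup code gets this from the OS */
    char *argv[] = { name, NULL };
    int status = entry(1, argv);
    /* A real _start would typically call exit(status) here
       and never return. */
    return status;
}
```

Calling model_start(app_main) runs app_main with a single argument and returns 0. The point is only that main is called by something else, and that something else is what the linker has to add.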
Object files are not the same as executables. The executable contains things like crt0 and the C standard library, or some way to dynamically link it as a shared object (and that is why you need to link your source.o, compiled from your empty main in source.c, into an executable).
On Linux, play with objdump(1) & readelf(1) (on some existing binaries, and also on your source.o object file).
See also elf(5), execve(2), ld-linux(8), Linux assembly howto, syscalls(2), Advanced Linux Programming, Operating Systems: Three Easy Pieces, and (to understand libc.so) Drepper's How To Write Shared Libraries, the Dragon Book ...
(you need to read entire books to understand the details; I gave some references)
Look also into Common Lisp & SBCL. Its compiler has a very different model (really different from C).