What is “object” in “object file” and why is it called this way? [duplicate]

China☆狼群 submitted on 2019-12-03 01:42:40

Object files (or object code) are machine code files generated by a compiler from source code.

The difference from an executable is that the object file isn't linked, so references to functions, symbols, etc. aren't resolved yet (their memory addresses are basically left blank).

When you compile a C file with GCC:

gcc -Wall -o test test.c

Here you are compiling AND linking, so you'll get an executable in which the references to the symbols it uses (your own functions, library functions, etc.) have been resolved.

But when you do this:

gcc -Wall -o test.o -c test.c

You'll produce an object file. It's also machine code, but it will need to be linked in order to produce an executable or a library.

When you have a project with many C files (for instance), you'll compile each one into object code, and then you will link all object files together in order to produce the final product.

For instance:

gcc -Wall -o foo.o -c foo.c              # Object file for foo.c
gcc -Wall -o bar.o -c bar.c              # Object file for bar.c
gcc -Wall -o main.o -c main.c            # Object file for main.c
gcc -Wall -o software foo.o bar.o main.o # Executable (foo + bar + main)
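As a rough illustration of that split, the three source files might look like the sketch below. The function names `foo_value`, `bar_value` and `software_total` are hypothetical, and the three "files" are shown in one listing (with comments marking the boundaries) so that it compiles as a single unit. The `main` function that would normally live in main.c is left out; it would simply call `software_total()`.

```c
/* ---- foo.c: compiled on its own, this becomes foo.o ---- */
int foo_value(void)
{
    return 40;
}

/* ---- bar.c: compiled on its own, this becomes bar.o ---- */
int bar_value(void)
{
    return 2;
}

/* ---- main.c: compiled on its own, this becomes main.o ---- */
/* In a separate main.c these declarations (normally pulled in from
 * headers) are all the compiler knows about the two functions. */
int foo_value(void);
int bar_value(void);

int software_total(void)
{
    /* The calls below are the references the linker has to resolve. */
    return foo_value() + bar_value();
}
```

Compiled as three separate files, main.o would carry undefined references to `foo_value` and `bar_value` until the final link step supplies them from foo.o and bar.o.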

The term object stands here for sequences of unlinked machine code (basically). An object file contains objects.

You asked: why is it called that way? I can't really answer. Why is "blue" named "blue"? ; )

It's just the term used since... well, decades...

For information, the GCC Internals documentation only defines object code as:

The “source code” for a work means the preferred form of the work for making modifications to it. “Object code” means any non-source form of a work.

Pretty vague about the historical reason...

I simply hope you now understand better what an object file is. I think that's more important than knowing why it's called that, as words are just, well, words...

I believe the name has something to do with making a distinction between:

  • code for humans -- source code
  • code for machines -- object code

Object files contain:

  • Header information: overall information about the file, such as the size of the code, name of the source file it was translated from, and creation date.
  • Object code: Binary instructions and data generated by a compiler or assembler.
  • Relocation: A list of the places in the object code that have to be fixed up when the linker changes the addresses of the object code.
  • Symbols: Global symbols defined in this module, symbols to be imported from other modules or defined by the linker.
  • Debugging information: Other information about the object code not needed for linking but of use to a debugger. This includes source file and line number information, local symbols, descriptions of data structures used by the object code such as C structure definitions.

Source: here
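To make the "Symbols" entry more concrete, here is a minimal sketch of one C module. All names in it (`call_count`, `step`, `bump`, `greet`) are hypothetical; only `puts` is a real library function. Running a tool such as `nm` on the resulting object file would typically list `greet` and `call_count` as defined global symbols, `bump` and `step` as local ones, and `puts` as undefined, though the exact output depends on the toolchain and optimization level.

```c
#include <stdio.h>

int call_count = 0;          /* global symbol defined in this module */
static int step = 1;         /* local symbol: invisible to other modules */

static int bump(int n)       /* local function symbol */
{
    return n + step;
}

int greet(const char *name)  /* global function symbol, usable by other modules */
{
    call_count = bump(call_count);
    puts(name);              /* puts is imported: undefined here, resolved at link time */
    return call_count;
}
```

The call to `puts` is also a place where the object code would need a relocation entry, since the compiler cannot know the function's final address.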

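An object file is a binary representation of a source (text) file. It's a collection of sections that segregate different kinds of data, such as:

  • text section (machine instructions)
  • data section (initialized global and static variables)
  • bss section (zero-initialized global and static variables)

The stack and the heap are not stored in the object file; they are memory regions that only exist once the program runs.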

Depending on your compiler/environment these may differ.
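As a rough sketch of where the parts of a small C file usually end up, consider the module below. The comments describe common GCC/ELF behaviour, which is an assumption rather than a guarantee, and all names are hypothetical. Only the first group lives in the object file itself; stack and heap storage only exists at run time.

```c
#include <stdlib.h>

int initialized_global = 7;      /* usually placed in .data */
int zeroed_global;               /* usually placed in .bss */
const char banner[] = "object";  /* usually placed in .rodata */

int sum_on_heap(int n)           /* the machine code goes into .text */
{
    int total = 0;               /* stack (or a register) at run time */
    int *values;

    if (n <= 0)
        return 0;

    values = malloc((size_t)n * sizeof *values);  /* heap at run time */
    if (values == NULL)
        return -1;

    for (int i = 0; i < n; i++) {
        values[i] = initialized_global + i;
        total += values[i];
    }

    free(values);
    return total;
}
```

The exact section names and placement vary by toolchain and target.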

E.g. on *nix systems:

objdump -d a.out    <--- provided we compiled a.cpp

Disassembly of section .init:

08048278 <_init>:
 8048278:       55                      push   %ebp
 8048279:       89 e5                   mov    %esp,%ebp
 804827b:       83 ec 08                sub    $0x8,%esp
 804827e:       e8 61 00 00 00          call   80482e4 <call_gmon_start>
 8048283:       e8 b3 00 00 00          call   804833b <frame_dummy>
 8048288:       e8 9f 01 00 00          call   804842c <__do_global_ctors_aux>
 804828d:       c9                      leave
 804828e:       c3                      ret
Disassembly of section .plt:

08048290 <puts@plt-0x10>:
 8048290:       ff 35 78 95 04 08       pushl  0x8049578
 8048296:       ff 25 7c 95 04 08       jmp    *0x804957c
 804829c:       00 00                   add    %al,(%eax)
        ...

080482a0 <puts@plt>:
 80482a0:       ff 25 80 95 04 08       jmp    *0x8049580
 80482a6:       68 00 00 00 00          push   $0x0
 80482ab:       e9 e0 ff ff ff          jmp    8048290 <_init+0x18>

080482b0 <__libc_start_main@plt>:
 80482b0:       ff 25 84 95 04 08       jmp    *0x8049584
 80482b6:       68 08 00 00 00          push   $0x8
 80482bb:       e9 d0 ff ff ff          jmp    8048290 <_init+0x18>
Disassembly of section .text:

The various call instructions here are then linked to the various libraries to call the actual functions.

According to the page you linked: "Each sequence, or object, typically contains instructions for the host machine to accomplish some task, possibly accompanied by related data and metadata (e.g. relocation information, stack unwinding information, comments, program symbols, debugging or profiling information)."

Basically, each object in the object file is a function, together with the information the linker needs to include it in the full program.
