The LD manual does not explain what the KEEP command does. Below is a snippet from a third-party linker script that features KEEP. What does the KEEP command do?
Minimal Linux IA-32 example that illustrates its usage
main.S
.section .text
.global _start
_start:
/* Dummy access so that after will be referenced and kept. */
mov after, %eax
/*mov keep, %eax*/
/* Exit system call. */
mov $1, %eax
/* Take the exit status 4 bytes after before. */
mov $4, %ebx
mov before(%ebx), %ebx
int $0x80
.section .before
before: .long 0
/* TODO why is the `"a"` required? */
.section .keep, "a"
keep: .long 1
.section .after
after: .long 2
link.ld
ENTRY(_start)
SECTIONS
{
. = 0x400000;
.text :
{
*(.text)
*(.before)
KEEP(*(.keep));
*(.keep)
*(.after)
}
}
Compile and run:
as --32 -o main.o main.S
ld --gc-sections -m elf_i386 -o main.out -T link.ld main.o
./main.out
echo $?
Output:
1
If we comment out the KEEP line, the output is:
2
If we either uncomment the mov keep, %eax line in main.S or drop --gc-sections from the ld command, the output goes back to 1.
Tested on Ubuntu 14.04, Binutils 2.25.
Explanation
There is no reference to the symbol keep, and therefore none to its containing section .keep.
Therefore, if garbage collection is enabled and we don't use KEEP to make an exception, that section will not be put in the executable.
Since we are adding 4 to the address of before, if the .keep section is not present, then the exit status will be 2, which is the value stored in the next section, .after.
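The offset arithmetic can be checked with a small model. This is only a sketch, not what ld actually does: it assumes the surviving input sections are laid out back to back in link.ld order with no padding, and the SECTIONS table and exit_status helper are names made up for illustration.

```python
import struct

# Hypothetical model: each input section holds one little-endian 32-bit
# value, matching `.long 0`, `.long 1` and `.long 2` in main.S.
SECTIONS = {".before": 0, ".keep": 1, ".after": 2}


def exit_status(kept_sections):
    """Concatenate the surviving sections in link.ld order and read the
    32-bit word located 4 bytes past the start of .before."""
    image = b"".join(struct.pack("<I", SECTIONS[name]) for name in kept_sections)
    (value,) = struct.unpack_from("<I", image, 4)
    return value


# With KEEP, .keep survives garbage collection and sits right after .before.
with_keep = exit_status([".before", ".keep", ".after"])
# Without KEEP, .keep is discarded, so .after moves up into its place.
without_keep = exit_status([".before", ".after"])

print(with_keep, without_keep)
```

The two values correspond to the two exit statuses observed above.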
TODO: nothing happens if we remove the "a" from .keep, which makes it allocatable. I don't understand why that is so: that section will be put inside the .text output section, which because of its magic name will be allocatable.
Force the linker to keep some specific sections
SECTIONS
{
....
....
*(.rodata .rodata.*)
KEEP(*(SORT(.scattered_array*)));
}
Afaik KEEP makes LD keep the contents of the matched sections even if none of their symbols are referenced, i.e. it exempts them from --gc-sections.
Usually used for sections that have some special meaning in the binary startup process, more or less to mark the roots of the dependency tree.
(For Sabuncu below)
Dependency tree:
If you eliminate unused code, you analyze the code and mark all reachable sections (code + global variables + constants).
So you pick a section, mark it as "used" and see what other sections it references, then you mark those sections as "used", check what they reference, and so on.
The sections that are not marked "used" are then redundant and can be eliminated.
Since a section can reference multiple other sections (e.g. one procedure calling three different other ones), if you draw the result you get a tree.
Roots:
The above principle, however, leaves us with a problem: what is the "first" section that is always used? The first node (root) of the tree, so to speak? This is what KEEP() does: it tells the linker which sections (if available) are the first ones to look at. As a consequence, these are always linked in.
Typically these are sections that are called from the program loader to perform tasks related to dynamic linking (can be optional, and OS/fileformat dependent), and the entry point of the program.
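The mark-and-sweep idea with roots can be sketched in a few lines. This is a toy model under stated assumptions, not ld's actual implementation: the collect_garbage function and the references table are hypothetical, and the section names are only borrowed from the main.S example earlier in the thread.

```python
from collections import deque


def collect_garbage(references, roots):
    """Return the set of sections reachable from the given root sections."""
    used = set(roots)
    queue = deque(roots)
    while queue:
        section = queue.popleft()
        # Mark everything this section references, then visit it in turn.
        for target in references.get(section, ()):
            if target not in used:
                used.add(target)
                queue.append(target)
    return used


# Hypothetical reference graph: .text (holding the entry point) touches
# .before and .after, while nothing references .keep.
references = {
    ".text": [".before", ".after"],
    ".before": [],
    ".keep": [],
    ".after": [],
}
all_sections = list(references)

# Only the entry point's section is a root: .keep is unreachable.
entry_only = collect_garbage(references, roots=[".text"])
# KEEP adds .keep as an extra root, so it survives.
with_keep = collect_garbage(references, roots=[".text", ".keep"])

discarded_without_keep = [s for s in all_sections if s not in entry_only]
discarded_with_keep = [s for s in all_sections if s not in with_keep]

print(discarded_without_keep)  # ['.keep']
print(discarded_with_keep)     # []
```

In this model a section wrapped in KEEP() is simply one more starting point for the marking pass, which is why it is never swept.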