Question
I have a question about dynamic linking on Linux. Consider the following disassembly of an ARM binary.
8300 <printf@plt-0x40>:
....
8320: e28fc600 add ip, pc, #0, 12
8324: e28cca08 add ip, ip, #8, 20 ; 0x8000
8328: e5bcf344 ldr pc, [ip, #836]! ; 0x344
....
83fc <main>:
...
8424: ebffffbd bl 8320 <_init+0x2c>
The main function calls printf at 8424: bl 8320. 8320 is an address in the .plt shown above. The code in the .plt then calls the dynamic linker to invoke the printf routine. My question is: how can the dynamic linker tell that this is a call to printf?
Answer 1:
TL;DR: The PLT calls the dynamic linker by passing:
the address of the GOT entry in IP (&PLTGOT[n+3]);
&PLTGOT[2] in LR.
Moreover, PLTGOT[1] identifies the shared object/executable.
The dynamic linker uses this to find the relocation entry (plt_relocation_table[n]) and thus the symbol (printf).
Explanation of the PLT entry code
This is explained (somewhat) in section A.3 of ELF for ARM:
8320: e28fc600 add ip, pc, #0, 12
8324: e28cca08 add ip, ip, #8, 20 ; 0x8000
8328: e5bcf344 ldr pc, [ip, #836]! ; 0x344
Which are explained by:
ADD ip, pc, #-8:PC_OFFSET_27_20:__PLTGOT(X) ; R_ARM_ALU_PC_G0_NC(__PLTGOT(X))
ADD ip, ip, #-4:PC_OFFSET_19_12:__PLTGOT(X) ; R_ARM_ALU_PC_G1_NC(__PLTGOT(X))
LDR pc, [ip, #0:PC_OFFSET_11_0:__PLTGOT(X)]! ; R_ARM_LDR_PC_G2(__PLTGOT(X))
Those instructions do two things:
they compute the address of the GOT entry as an offset from PC and store it in the IP register;
they jump to this GOT entry.
The spec notes that:
The write-back on the final LDR ensures that ip contains the address of the PLTGOT entry. This is critical to incremental dynamic linking.
The "write-back" is the "!" in the last instruction: it updates the IP register with the final offset added (#836). This way, IP contains the address of the GOT entry at the end of the PLT entry.
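As a sanity check, the three instructions of the PLT entry can be replayed by hand. The sketch below is Python rather than ARM code; the helper names (ror32, arm_imm) are mine, and it assumes ARM state, where reading pc yields the address of the current instruction plus 8. The result should be the address of the GOT entry, which is also the Offset of the R_ARM_JUMP_SLOT relocation for printf quoted in this answer (0001066c).

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def arm_imm(insn):
    """Decode the immediate of an ARM data-processing instruction:
    an 8-bit value rotated right by twice the 4-bit rotate field."""
    imm8 = insn & 0xFF
    rot = (insn >> 8) & 0xF
    return ror32(imm8, 2 * rot)


# Assumption: ARM state, so pc reads as "address of this instruction + 8".
pc = 0x8320 + 8

ip = (pc + arm_imm(0xE28FC600)) & MASK32   # 8320: add ip, pc, #0, 12
ip = (ip + arm_imm(0xE28CCA08)) & MASK32   # 8324: add ip, ip, #8, 20
ip = (ip + (0xE5BCF344 & 0xFFF)) & MASK32  # 8328: ldr pc, [ip, #836]! (write-back)

got_entry = ip  # what the dynamic linker sees in IP
print(hex(got_entry))
```

The second immediate decodes to 0x8000 and the load offset is 0x344, so the chain is 0x8328 + 0x8000 + 0x344.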
The dynamic linker has the address of the GOT entry in IP:
it can find the shared-object or executable;
it can find the correct relocation entry.
This relocation entry references the symbol of the target function (printf in your case):
Offset   Info     Type            Sym. Value Sym. Name
0001066c 00000116 R_ARM_JUMP_SLOT 00000000   printf
The Base Platform ABI for the ARM architecture notes that:
When the platform supports lazy function binding (as ARM Linux does) this ABI requires ip to address the corresponding PLTGOT entry at the point where the PLT calls through it. (The PLT is required to behave as if it ended with LDR pc, [ip]).
Finding the relocation entry from the GOT
Now, how the relocation entry is found from the GOT address is not obvious. Binary search could be used, but it would not be convenient. GNU ld.so does it like this (glibc/sysdeps/arm/dl-trampoline.S):
_dl_runtime_resolve:
    cfi_adjust_cfa_offset (4)
    cfi_rel_offset (lr, 0)

    @ we get called with
    @   stack[0] contains the return address from this call
    @   ip contains &GOT[n+3] (pointer to function)
    @   lr points to &GOT[2]

    @ Save arguments. We save r4 to realign the stack.
    push {r0-r4}
    cfi_adjust_cfa_offset (20)
    cfi_rel_offset (r0, 0)
    cfi_rel_offset (r1, 4)
    cfi_rel_offset (r2, 8)
    cfi_rel_offset (r3, 12)

    @ get pointer to linker struct
    ldr r0, [lr, #-4]

    @ prepare to call _dl_fixup()
    @ change &GOT[n+3] into 8*n NOTE: reloc are 8 bytes each
    sub r1, ip, lr
    sub r1, r1, #4
    add r1, r1, r1
    [...]
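The last three instructions of this excerpt turn the two GOT addresses into a byte offset into the PLT relocation table. Here is a small Python model of that arithmetic. The function name and the GOT base address are made up for illustration; only the sub/sub/add sequence comes from the excerpt.

```python
GOT_ENTRY_SIZE = 4  # 32-bit ARM: one GOT slot is 4 bytes
REL_ENTRY_SIZE = 8  # "reloc are 8 bytes each" (Elf32_Rel)


def reloc_offset(ip, lr):
    """Mirror the trampoline: ip = &GOT[n+3], lr = &GOT[2]."""
    r1 = ip - lr  # sub r1, ip, lr   -> 4*(n+1)
    r1 -= 4       # sub r1, r1, #4   -> 4*n
    r1 += r1      # add r1, r1, r1   -> 8*n
    return r1


got = 0x10660  # hypothetical GOT base, not taken from the question's binary
for n in range(3):
    ip = got + GOT_ENTRY_SIZE * (n + 3)
    lr = got + GOT_ENTRY_SIZE * 2
    print(n, reloc_offset(ip, lr))
```

The n-th PLT slot therefore maps to the n-th entry of the PLT relocation table without any search.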
The address of the second GOT entry is in LR. I guess this is done by .PLT0:
00015b84 :
15b84: e52de004 push {lr} ; (str lr, [sp, #-4]!)
15b88: e59fe004 ldr lr, [pc, #4] ; 15b94
15b8c: e08fe00e add lr, pc, lr
15b90: e5bef008 ldr pc, [lr, #8]!
15b94: 0012f46c andseq pc, r2, ip, ror #8
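The value that ends up in LR can be worked out from this listing. In the Python sketch below I assume ARM state (pc reads as the instruction address plus 8) and treat the word at 15b94 as data, not as an instruction, even though the disassembler printed it as andseq. The variable names are mine.

```python
MASK32 = 0xFFFFFFFF


def arm_pc(insn_addr):
    """Value read from pc by an ARM-state instruction at insn_addr."""
    return insn_addr + 8


# 15b88: ldr lr, [pc, #4] loads the literal word stored after the code.
literal_addr = arm_pc(0x15B88) + 4
lr = 0x0012F46C  # the word shown at 15b94

# 15b8c: add lr, pc, lr turns the pc-relative literal into an absolute address.
lr = (arm_pc(0x15B8C) + lr) & MASK32
got_base = lr

# 15b90: ldr pc, [lr, #8]! jumps through GOT[2] and, because of the
# write-back, leaves lr pointing at &GOT[2].
lr = (lr + 8) & MASK32

print(hex(literal_addr), hex(got_base), hex(lr))
```

With these numbers the literal is read from 0x15b94, LR first holds 0x145000 and the write-back moves it to 0x145008, which fits the "lr points to &GOT[2]" comment in the trampoline.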
From those two GOT addresses, the dynamic linker can find the GOT offset and the offset in the PLT relocation table.
From &GOT[2], the dynamic linker can find the second entry of the PLTGOT (GOT[1]), which contains the address of the linker struct (a reference used by the dynamic linker to recognise this shared object/executable).
I don't know where this is specified: it does not seem to be part of the base ARM ABI spec.
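To tie the pieces together, here is a toy Python model of the lookup from relocation offset to symbol name. It is not glibc's _dl_fixup: the three tables are hand-built stand-ins for one object's PLT relocation table, dynamic symbol table and string table, and the function name is hypothetical. The record layouts are the standard 32-bit ELF ones (Elf32_Rel is 8 bytes, Elf32_Sym is 16 bytes), and the relocation values are the ones shown above for printf.

```python
import struct

R_ARM_JUMP_SLOT = 22   # relocation type 0x16
ELF32_SYM_SIZE = 16    # st_name, st_value, st_size, st_info, st_other, st_shndx

# Toy tables (assumed contents, little-endian).
strtab = b"\0printf\0"
symtab = struct.pack("<IIIBBH", 0, 0, 0, 0, 0, 0)       # index 0: null symbol
symtab += struct.pack("<IIIBBH", 1, 0, 0, 0x12, 0, 0)   # index 1: st_name -> "printf"
jmprel = struct.pack("<II", 0x0001066C, 0x00000116)     # Offset, Info


def symbol_for_reloc(reloc_offset):
    """Return (GOT slot address, symbol name) for a byte offset into jmprel."""
    r_offset, r_info = struct.unpack_from("<II", jmprel, reloc_offset)
    sym_index = r_info >> 8      # ELF32_R_SYM
    rel_type = r_info & 0xFF     # ELF32_R_TYPE
    if rel_type != R_ARM_JUMP_SLOT:
        raise ValueError("not a PLT relocation")
    (st_name,) = struct.unpack_from("<I", symtab, sym_index * ELF32_SYM_SIZE)
    end = strtab.index(b"\0", st_name)
    return r_offset, strtab[st_name:end].decode("ascii")


slot, name = symbol_for_reloc(0)
print(hex(slot), name)
```

A real resolver would then look the name up in the loaded objects and store the resulting address in the GOT slot, so that later calls skip the dynamic linker.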
Answer 2:
.rela.plt (.rel.plt on 32-bit ARM) contains the relocation entry that refers to printf, which tells the dynamic linker which symbol to locate.
See this link for details; it is very easy to digest: https://www.technovelty.org/linux/plt-and-got-the-key-to-code-sharing-and-dynamic-libraries.html. The article first explains how variables in shared libraries are accessed, and then functions.
Answer 3:
The process of dynamic linking is described in great detail here.
TL;DR: at static link time, ld creates a set of tables in special sections such as .rel.dyn, .rel.plt, etc., which tell the runtime loader what to do at runtime.
You can examine these tables with nm -D, readelf -Wr, objdump -R, etc.
Source: https://stackoverflow.com/questions/32776395/how-the-dynamic-linker-determines-which-routine-to-call-on-linux