Question
I recently started learning assembly language for the Intel x86-64 architecture using YASM. While solving one of the tasks suggested in a book (by Ray Seyfarth), I ran into the following problem:
When I place some characters into a buffer in the .bss section, I still see an empty string while debugging it in gdb. Placing characters into a buffer in the .data section shows up as expected in gdb.
segment .bss
result resb 75
buf resw 100
usage resq 1
segment .data
str_test db 0, 0, 0, 0
segment .text
global main
main:
mov rbx, 'A'
mov [buf], rbx ; LINE - 1 STILL GET EMPTY STRING AFTER THAT INSTRUCTION
mov [str_test], rbx ; LINE - 2 PLACES CHARACTER NICELY.
ret
In gdb I get:
after LINE 1, x/s &buf gives: 0x7ffff7dd2740 <buf>: ""
after LINE 2, x/s &str_test gives: 0x601030: "A"
It looks like &buf isn't evaluating to the correct address, so gdb still sees all zeros. According to the process's /proc/PID/maps, 0x7ffff7dd2740 isn't in the BSS of the process being debugged, so that makes no sense. Why does &buf evaluate to the wrong address while &str_test evaluates to the right one? Neither is a "global" symbol, but we did build with debug info.
Tested with GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10 on x86-64 Ubuntu 15.10.
I'm building with
yasm -felf64 -Worphan-labels -gdwarf2 buf-test.asm
gcc -g buf-test.o -o buf-test
nm on the executable shows the correct symbol addresses:
$ nm -n buf-test # numeric sort, heavily edited to omit symbols from glibc
...
0000000000601028 D __data_start
0000000000601038 d str_test
...
000000000060103c B __bss_start
0000000000601040 b result
000000000060108b b buf
0000000000601153 b usage
(editor's note: I rewrote a lot of the question because the weirdness is in gdb's behaviour, not the OP's asm!).
Answer 1:
glibc includes a symbol named buf, as well.
(gdb) info variables ^buf$
All variables matching regular expression "^buf$":
File strerror.c:
static char *buf;
Non-debugging symbols:
0x000000000060108b buf <-- this is our buf
0x00007ffff7dd6400 buf <-- this is glibc's buf
gdb happens to choose the symbol from glibc over the symbol from the executable. This is why ptype buf shows char *.
Using a different name for the buffer avoids the problem, and so does adding a global buf directive to make it a global symbol. You also wouldn't have a problem if you wrote a stand-alone program that didn't link libc (i.e. define _start and make an exit system call instead of running a ret).
Note that 0x00007ffff7dd6400 (the address of buf on my system; different from yours) is not actually a stack address. It looks like one at a glance, but it isn't: it has a different number of f digits after the 7. Sorry for that confusion in comments and an earlier edit of the question.
Shared libraries are also loaded near the top of the low 47 bits of virtual address space, near where the stack is mapped. They're position-independent, but a library's BSS space has to be in the right place relative to its code. Checking /proc/PID/maps again more carefully, gdb's &buf is in fact in the rwx block of anonymous memory (not mapped to any file) right next to the mapping for libc-2.21.so.
7ffff7a0f000-7ffff7bcf000 r-xp 00000000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7bcf000-7ffff7dcf000 ---p 001c0000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dcf000-7ffff7dd3000 r-xp 001c0000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dd3000-7ffff7dd5000 rwxp 001c4000 09:7f 17031175 /lib/x86_64-linux-gnu/libc-2.21.so
7ffff7dd5000-7ffff7dd9000 rwxp 00000000 00:00 0 <--- &buf is in this mapping
...
7ffffffdd000-7ffffffff000 rwxp 00000000 00:00 0 [stack] <---- more FFs before the first non-FF than in &buf.
A normal call instruction with a rel32 encoding in the executable can't reach a library function mapped this far away, because a rel32 displacement only covers about 2 GiB in either direction. It doesn't need to, though: GNU/Linux shared libraries have to support symbol interposition, so calls to library functions actually jump to the PLT, where an indirect jmp (with a pointer from the GOT) goes to the final destination.
Source: https://stackoverflow.com/questions/39220351/gdb-behaves-differently-for-symbols-in-the-bss-vs-symbols-in-data