Question
section .text
global _start ;must be declared for using gcc
_start: ;tell linker entry point
mov edx, len ;message length
mov ecx, msg ;message to write
mov ebx, 1 ;file descriptor (stdout)
mov eax, 4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax, 1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db 'Hello, world!',0xa ;our dear string
len equ $ - msg ;length of our dear string
This is a basic 32-bit x86 Linux assembly program that prints "Hello, world!" to the screen (standard output). Build and run it with
nasm -felf -g -Fdwarf hello.asm
gcc -g -m32 -nostdlib -static -o hello hello.o
./hello
(Editor's note: or run gdb ./hello to debug / single-step it. That's why we used nasm -g -Fdwarf and gcc -g. Or use layout reg inside GDB for a disassembly+register view that doesn't depend on debug symbols. See the bottom of https://stackoverflow.com/tags/x86/info)
Now I want to ask how this code works behind the scenes. For example, why are all these instructions needed
_start: ;tell linker entry point
mov edx, len ;message length
mov ecx, msg ;message to write
mov ebx, 1 ;file descriptor (stdout)
mov eax, 4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax, 1 ;system call number (sys_exit)
int 0x80 ;call kernel
just to print "Hello, World!"? And what about the _start: label above: is it the main function? And why is the int 0x80 instruction used at all? Can you give me a deep explanation of the basic workings of this program?
Answer 1:
In machine code, there are no functions. At least, the processor knows nothing about functions. The programmer can structure the code however they like. _start is a symbol, which is just a name for a location in your program. Symbols are used to refer to locations whose addresses you don't know yet; they are resolved during linking. The symbol _start is used as the entry point (cf. this answer), which is where the operating system jumps to start your program. Unless you specify the entry point in some other way, every program must contain _start. The other symbols your program uses are msg, which the linker resolves to the address where the string Hello, world! resides, and len, which is the length of msg.
The rest of the program does the following things:
- Set up the registers for the system call write(1, msg, len). write has system call number 4, which is stored in eax to let the operating system know you want system call 4. This system call writes data to a file. The file descriptor number supplied is 1, which stands for standard output.
- Perform a system call using int $0x80. This instruction interrupts your program; the operating system picks this up and performs the function whose number is stored in eax. It's like a function call that calls into the OS kernel. The calling convention is different from that of other functions, with args passed in registers.
- Set up the registers for the system call _exit(?). Its system call number is 1, which goes into eax. Sadly, the code forgets to set the argument for _exit, which should be 0 to indicate success. Instead, whatever was in ebx before is used, which is 1 here because ebx still holds the file descriptor set up for write.
- Perform a system call using int $0x80. Because _exit ends the program, it does not return. Your program ends here.
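For comparison, the two system calls in the list above can be sketched in C. This is only an illustration, not the asker's program: it goes through the C library's write() and _exit() wrappers instead of loading registers and executing int 0x80 by hand, and the helper names hello_write and exit_status_seen_by_parent are hypothetical, made up for this sketch. The second helper runs _exit in a child process so that the exit status stays observable.

```c
#include <sys/types.h>
#include <sys/wait.h>   /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>     /* write, _exit, fork */

static const char msg[] = "Hello, world!\n";

/* Same request as the asm: write(1, msg, len).
 * Returns the number of bytes written, or -1 on error. */
long hello_write(void)
{
    /* sizeof msg counts the terminating NUL, which the asm string lacks */
    return (long)write(1, msg, sizeof msg - 1);
}

/* Hypothetical helper: calls _exit(status) in a child process and
 * returns the exit status the parent observes, or -1 on failure. */
int exit_status_seen_by_parent(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(status);          /* ends the child; never returns */

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) < 0 || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

Calling hello_write() prints the message and returns the byte count. exit_status_seen_by_parent(1) reports 1, the status the asm program ends with because ebx is left over from the write call, while exit_status_seen_by_parent(0) reports the conventional success status 0.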
The directive db tells the assembler to place the following data into the program where we currently are. This places the string Hello, world! followed by a newline into the program so we can tell the write system call to write that string.
The line len equ $ - msg tells the assembler that len is the difference between $ (where we currently are) and msg. This is defined so we can pass to write how long the text we want to print is.
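As a rough C analogy (an assumption made for illustration, not how NASM works internally), db corresponds to a byte array with no terminating NUL, and len equ $ - msg corresponds to subtracting the array's start address from its end address. The function name msg_len_by_address is hypothetical.

```c
#include <stddef.h>

/* msg db 'Hello, world!',0xa : raw bytes, no terminating NUL */
static const unsigned char msg[] = {
    'H', 'e', 'l', 'l', 'o', ',', ' ',
    'w', 'o', 'r', 'l', 'd', '!', 0x0a
};

/* len equ $ - msg : a constant fixed at build time */
enum { len = sizeof msg };

/* The same value computed the way the assembler does it:
 * address just past the data minus address of the data. */
size_t msg_len_by_address(void)
{
    const unsigned char *end = msg + sizeof msg;  /* plays the role of $ */
    return (size_t)(end - msg);
}
```

Both len and msg_len_by_address() evaluate to 14: thirteen characters plus the newline byte 0xa.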
Everything after a semicolon (;) in the program is a comment ignored by the assembler.
Source: https://stackoverflow.com/questions/45052162/what-is-the-explanation-of-this-x86-hello-world-using-32-bit-int-0x80-linux-syst