Question
Consider the following C code:
#include <stdint.h>

void func(void) {
    uint32_t var = 0;
    return;
}
The unoptimized (i.e., -O0 option) assembly code generated by GCC 4.7.2 for the code above is:
func:
    pushl %ebp
    movl %esp, %ebp
    subl $16, %esp
    movl $0, -4(%ebp)
    nop
    leave
    ret
According to the stack alignment requirements of the System V ABI, the stack must be aligned to 16 bytes before every call instruction (the stack boundary is 16 bytes by default unless changed with the -mpreferred-stack-boundary option). Therefore, ESP modulo 16 has to be zero just before a function call.
Bearing these stack alignment requirements in mind, I assume the following representation of the stack's state just before executing the leave instruction is right:
Size (bytes)       Stack          ESP mod 16      Description
-----------------------------------------------------------------------------------
             |     . . .      |
             ------------------........0          at func call
         4   | return address |
             ------------------.......12          at func entry
         4   |   saved EBP    |
     ---->   ------------------........8          EBP is pointing at this address
     |   4   |      var       |
     |       ------------------........4
 16  |       |                |
     |  12   |                |
     |       |                |
     ---->   ------------------........8          after allocating 16 bytes
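The ESP mod 16 column above can be re-derived with a few lines of modular arithmetic. The following is only a sketch of that bookkeeping, assuming a downward-growing IA-32 stack and ESP mod 16 == 0 just before the call; the helper names after_push and esp_mod16_after_prologue are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* ESP % 16 after moving ESP down by `bytes` on a downward-growing stack.
   `esp_mod16` is the current value of ESP % 16. */
static unsigned after_push(unsigned esp_mod16, unsigned bytes)
{
    assert(esp_mod16 < 16u);
    /* Add 16 before subtracting so the intermediate value never goes negative. */
    return (esp_mod16 + 16u - (bytes % 16u)) % 16u;
}

/* ESP % 16 once the prologue of func has run, for a given `subl` amount.
   Assumption: the caller had ESP % 16 == 0 just before `call func`. */
static unsigned esp_mod16_after_prologue(unsigned frame_bytes)
{
    unsigned esp = 0u;                    /* aligned just before the call  */
    esp = after_push(esp, 4u);            /* return address        -> 12   */
    esp = after_push(esp, 4u);            /* pushl %ebp            -> 8    */
    esp = after_push(esp, frame_bytes);   /* subl $frame_bytes, %esp       */
    return esp;
}
```

Under these assumptions, esp_mod16_after_prologue(16) evaluates to 8, matching the last row of the diagram, while a frame of 8 bytes would give 0.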
With this representation of the stack in mind, there are two points that puzzle me:
1. var is obviously not aligned on the stack to 16 bytes. This seems to contradict what I have read in this answer to this question (emphasis mine):
   -mpreferred-stack-boundary=n where the compiler tries to keep items on the stack aligned to 2^n.
   In my case -mpreferred-stack-boundary wasn't provided, so it defaults to 4 (i.e., a 2^4 = 16-byte boundary) according to this section of GCC's documentation (I indeed got the same results with -mpreferred-stack-boundary=4).
2. The purpose of allocating 16 bytes on the stack (i.e., the subl $16, %esp instruction) instead of just 8 bytes: after allocating 16 bytes, the stack is not aligned to 16 bytes, nor is any memory saved. Allocating just 8 bytes instead would align the stack to 16 bytes, and no additional 8 bytes would be wasted.
Answer 1:
Looking at -O0-generated machine code is usually a futile exercise. The compiler will emit whatever works, in the simplest possible way. This often leads to bizarre artifacts.
Stack alignment only refers to alignment of the stack frame. It is not directly related to the alignment of objects on the stack. GCC will allocate on-stack objects with the required alignment. This is simpler if GCC knows that the stack frame already provides sufficient alignment, but if not, GCC will use a frame pointer and perform explicit alignment.
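The distinction between frame alignment and object alignment can be observed from C itself. The sketch below assumes a C11 compiler and a platform where converting a pointer to uintptr_t yields the numeric address (true for GCC on x86, but implementation-defined in general); the function name locals_are_aligned is hypothetical.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas, alignof (C11) */
#include <stdint.h>

/* The plain uint32_t only needs its natural alignment, which is
   expected to be smaller than the 16-byte stack boundary. */
static_assert(alignof(uint32_t) <= 16, "unexpected uint32_t alignment");

/* Returns 1 if each local sits at an address that satisfies its own
   alignment requirement, whatever the alignment of the frame itself. */
static int locals_are_aligned(void)
{
    uint32_t var = 0;                         /* natural alignment, typically 4 */
    alignas(16) unsigned char buf[16] = {0};  /* explicitly requests 16 bytes   */

    /* Assumption: uintptr_t holds the numeric address of the object. */
    uintptr_t var_addr = (uintptr_t)&var;
    uintptr_t buf_addr = (uintptr_t)buf;

    return var_addr % alignof(uint32_t) == 0 && buf_addr % 16 == 0;
}
```

Here var is only guaranteed to be aligned for its own type, while buf is 16-byte aligned because that was requested for the object, not because of how the frame was laid out.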
Answer 2:
This answer aims to further develop some of the comments written above.
First, based on Margaret Bloom's comment, consider the following modification of the func() function that was originally posted:
#include <stdint.h>

void bar(void);

void func(void) {
    uint32_t var = 0;
    bar(); // <--- function call
    return;
}
Unlike the original func() function, the redefined one contains a call to bar().
This time, the generated assembly code is:
func:
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    movl $0, -12(%ebp)
    call bar
    nop
    leave
    ret
Note that the instruction subl $24, %esp does align the stack to 16 bytes (the subl $16, %esp instruction in the original func() function didn't).
Since the redefined func() now contains a function call (i.e., call bar), the stack has to be aligned to 16 bytes just before executing the call instruction. The previous func() called no function at all, so there was no need for the stack to be aligned to 16 bytes.
Clearly, at least 4 bytes must be allocated on the stack for the var variable. Allocating 4 additional bytes, 8 in total, would be enough to align the stack to 16 bytes, because ESP modulo 16 is 8 right after pushl %ebp.
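The "8 bytes would be enough" figure follows from rounding the used stack space up to the next multiple of 16. The sketch below shows that rounding, assuming ESP mod 16 == 8 after pushl %ebp (return address plus saved EBP); min_aligned_frame is a hypothetical helper and is not meant to describe how GCC actually sizes its frames.

```c
#include <assert.h>
#include <limits.h>

/* Smallest `subl` amount that holds `locals_bytes` of locals and leaves
   ESP 16-byte aligned. Assumption: 8 bytes (return address + saved EBP)
   were already pushed on top of a 16-byte aligned ESP. */
static unsigned min_aligned_frame(unsigned locals_bytes)
{
    const unsigned already_pushed = 8u;

    /* Guard against unsigned wrap-around in the rounding below. */
    assert(locals_bytes <= UINT_MAX - 23u);

    unsigned total = already_pushed + locals_bytes;
    unsigned rounded = (total + 15u) & ~15u;  /* round up to a multiple of 16 */
    return rounded - already_pushed;
}
```

With these assumptions, min_aligned_frame(4) evaluates to 8. The 24 bytes chosen by GCC 4.7.2 also satisfy the alignment constraint, since 8 + 24 is a multiple of 16; it is simply not the minimum.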
One may ask why 24 bytes are allocated to align the stack when allocating just 8 bytes would do. Paraphrasing part of Ped7g's comment answers this question as well:
Also keep in mind the C compiler is not obliged to produce optimal code in any kind of metric, including stack space usage. While it will try hard (and from playing around with gcc 4.7.2 on godbolt it looks good, the junk space is result only of the alignment), there's no language-breaking problem if it would fail and allocate 16B more junk than truly needed (especially in unoptimized code).
Source: https://stackoverflow.com/questions/47411158/understanding-stack-alignment-enforcement