Why is 0 moved to stack when using return value?

假如想象 提交于 2020-05-22 06:46:41

问题


I'm experimenting disassembling clang binaries of simple C programs (compiled with -O0), and I'm confused about a certain instruction that gets generated.

Here are two empty main functions with standard arguments, one of which returns value and other does not:

// return_void.c
void main(int argc, char** argv)
{
}

// return_0.c
int main(int argc, char** argv)
{
    return 0;
}

Now, when I disassemble their assemblies, they look reasonably different, but there's one line that I don't understand:

return_void.bin:
(__TEXT,__text) section
_main:
0000000000000000    pushq   %rbp
0000000000000001    movq    %rsp, %rbp
0000000000000004    movl    %edi, -0x4(%rbp)
0000000000000007    movq    %rsi, -0x10(%rbp)
000000000000000b    popq    %rbp
000000000000000c    retq

return_0.bin:
(__TEXT,__text) section
_main:
0000000100000f80    pushq   %rbp                
0000000100000f81    movq    %rsp, %rbp          
0000000100000f84    xorl    %eax, %eax          # We return with EAX, so we clean it to return 0
0000000100000f86    movl    $0x0, -0x4(%rbp)    # What does this mean?
0000000100000f8d    movl    %edi, -0x8(%rbp)
0000000100000f90    movq    %rsi, -0x10(%rbp)
0000000100000f94    popq    %rbp
0000000100000f95    retq

It only gets generated when I use the function is not void, so I thought that it might be another way to return 0, but when I changed the returned constant, this line didn't change at all:

// return_1.c
int main(int argc, char** argv)
{
    return 1;
}

empty_return_1.bin:
(__TEXT,__text) section
_main:
0000000100000f80    pushq   %rbp
0000000100000f81    movq    %rsp, %rbp
0000000100000f84    movl    $0x1, %eax           # Return value modified
0000000100000f89    movl    $0x0, -0x4(%rbp)    # This value is not modified
0000000100000f90    movl    %edi, -0x8(%rbp)
0000000100000f93    movq    %rsi, -0x10(%rbp)
0000000100000f97    popq    %rbp
0000000100000f98    retq

Why is this line getting generated and what is it's purpose?


回答1:


The purpose of that area is revealed by the following code

int main(int argc, char** argv)
{
    if (rand() == 42)
      return 1;

    printf("Helo World!\n");
    return 0;
}

At the start it does

movl    $0, -4(%rbp)

then the early return looks as follows

callq   rand
cmpl    $42, %eax
jne .LBB0_2
movl    $1, -4(%rbp)
jmp .LBB0_3

and then at the end it does

.LBB0_3:
movl    -4(%rbp), %eax
addq    $32, %rsp
popq    %rbp
retq

So, this area is indeed reserved to store the function return value. It doesn't appear to be terribly necessary and it is not used in optimized code, but in -O0 mode that's the way it works.




回答2:


clang is making space on the stack for the arguments (registers edi and rsi) and puts the value 0 on the stack, too, for some reason. I assume that clang compiles your code to an SSA-representation like this:

int main(int argc, char** argv)
{
    int a;

    a = 0;
    return a;
}

This would explain why a stack slot is allocated. If clang does constant propagation, too, this would explain why eax is zeroed out instead of being loaded from -4(%rbp). In general, don't think too much about dubious constructs in unoptimized assembly. After all, you forbade the compiler from removing useless code.




回答3:


According the the standard (for hosted environments), 5.1.2.2.1, main is required to returrn an int result. So do not expect defined behavior if violating this.

Furthermore, main is actually _not required to explicitly return 0; this is implicitly returned if it reaches the end of the function. (Note this is only for main, which also does not have a prototype.




回答4:


movl   $0x0,-0x4(%rbp)

This instruction stores 0 at %rbp - 4. It seems that clang allocates a hidden local variable for an implicit return value from main.

From the clang mailing list:

Yes. We allocate an implicit local variable to hold the return value; return statements then just initialize the return slot and jump to the epilogue, where the slot is loaded and returned. We don't use a phi because the control flow for getting to the epilogue is not necessarily as simple as a simple branch, due to cleanups in local scopes (like C++ destructors).

Implicit return values like main's are handled with an implicit store in the prologue.

Source: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2012-February/019767.html



来源:https://stackoverflow.com/questions/31149806/why-is-0-moved-to-stack-when-using-return-value

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!