At what moment is memory typically allocated for local variables in C++?

故里飘歌 2021-01-01 11:04

I'm debugging a rather weird stack overflow, supposedly caused by allocating variables that are too large on the stack, and I'd like to clarify the following.

Suppose I have the

5 answers
  • 2021-01-01 11:14

    I checked on LLVM:

    void doSomething(char*,char*);
    
    void function(bool b)
    {
        char b1[1 * 1024];
        if( b ) {
           char b2[1 * 1024];
           doSomething(b1, b2);
        } else {
           char b3[512 * 1024];
           doSomething(b1, b3);
        }
    }
    

    Yields:

    ; ModuleID = '/tmp/webcompile/_28066_0.bc'
    target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
    target triple = "x86_64-unknown-linux-gnu"
    
    define void @_Z8functionb(i1 zeroext %b) {
    entry:
      %b1 = alloca [1024 x i8], align 1               ; <[1024 x i8]*> [#uses=1]
      %b2 = alloca [1024 x i8], align 1               ; <[1024 x i8]*> [#uses=1]
      %b3 = alloca [524288 x i8], align 1            ; <[524288 x i8]*> [#uses=1]
      %arraydecay = getelementptr inbounds [1024 x i8]* %b1, i64 0, i64 0 ; <i8*> [#uses=2]
      br i1 %b, label %if.then, label %if.else
    
    if.then:                                          ; preds = %entry
      %arraydecay2 = getelementptr inbounds [1024 x i8]* %b2, i64 0, i64 0 ; <i8*> [#uses=1]
      call void @_Z11doSomethingPcS_(i8* %arraydecay, i8* %arraydecay2)
      ret void
    
    if.else:                                          ; preds = %entry
      %arraydecay6 = getelementptr inbounds [524288 x i8]* %b3, i64 0, i64 0 ; <i8*> [#uses=1]
      call void @_Z11doSomethingPcS_(i8* %arraydecay, i8* %arraydecay6)
      ret void
    }
    
    declare void @_Z11doSomethingPcS_(i8*, i8*)
    

    You can see the three alloca instructions at the top of the function, in the entry block.

    I must admit I am slightly disappointed that b2 and b3 are not folded together in the IR, since only one of them will ever be used.

  • 2021-01-01 11:16

    This optimization is known as "stack coloring", because you're assigning multiple stack objects to the same address. This is an area that we know LLVM can improve. Currently LLVM only does this for stack objects created by the register allocator for spill slots. We'd like to extend this to handle user stack variables as well, but we need a way to capture the lifetime of the value in IR.

    There is a rough sketch of how we plan to do this here: http://nondot.org/sabre/LLVMNotes/MemoryUseMarkers.txt

    Implementation work on this is underway; several pieces are already implemented in mainline.

    -Chris

  • 2021-01-01 11:19

    On many platforms/ABIs, the entire stack frame (including memory for every local variable) is allocated when you enter the function. On others, it's common to push/pop memory bit by bit, as it is needed.

    Of course, even where the entire stack frame is allocated in one go, different compilers might still decide on different stack frame sizes. In your case, some compilers would miss an optimization opportunity and allocate unique memory for every local variable, even the ones that live in different branches of the code (both the 1 * 1024 array and the 512 * 1024 one). A better optimizing compiler should allocate only the maximum memory required by any single path through the function. Here that is the else path, so the 1 KB b1 plus one 512 KB block should be enough. If you want to know what your platform does, look at the disassembly.

    But it wouldn't surprise me to see the entire chunk of memory allocated immediately.
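    If the disassembly shows that the compiler reserves space for both branches up front, one possible workaround is to move each branch's buffer into its own function, so the large frame only exists while that path runs. The following is a minimal sketch under assumptions, not a definitive fix: the names dispatch, smallPath and largePath are hypothetical, doSomething is a stand-in that returns a value so the result can be checked, and [[gnu::noinline]] is a GCC/Clang attribute (other compilers spell it differently or may ignore it).

```cpp
#include <cstddef>

// Hypothetical stand-in for the doSomething() in the question:
// touches both buffers and returns a small value.
std::size_t doSomething(char* a, char* b)
{
    a[0] = 1;
    b[0] = 2;
    return static_cast<std::size_t>(a[0] + b[0]);
}

// Each helper owns exactly one branch-local buffer. noinline asks the
// compiler not to merge the helper's frame back into the caller's frame.
[[gnu::noinline]] std::size_t smallPath(char* b1)
{
    char b2[1 * 1024];
    return doSomething(b1, b2);
}

[[gnu::noinline]] std::size_t largePath(char* b1)
{
    char b3[512 * 1024];   // this 512 KiB belongs to largePath's frame only
    return doSomething(b1, b3);
}

// The caller's own frame now only needs room for b1.
std::size_t dispatch(bool b)
{
    char b1[1 * 1024];
    return b ? smallPath(b1) : largePath(b1);
}
```

    Whether this actually shrinks the frame of dispatch still depends on the compiler and optimization level, so it is worth confirming in the disassembly.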

  • 2021-01-01 11:28

    Your local (stack) variables live inside the function's stack frame. When the function is called, the stack pointer is adjusted to "make room" for the frame, typically in a single step at function entry. If the local variables need more stack than is available, you'll encounter a stack overflow.

    ~512 KB is really too large for a stack buffer in any case; you should allocate it on the heap instead, for example with std::vector.
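    As a rough illustration of that suggestion, the sketch below keeps the small buffer on the stack and moves the large one into a std::vector. The function name functionWithHeapBuffer is hypothetical, and doSomething is a stand-in defined here only so the example is self-contained.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the doSomething() in the question:
// writes one byte at the start of each buffer.
void doSomething(char* a, char* b)
{
    a[0] = 'a';
    b[0] = 'b';
}

// Returns the size of the large buffer so a caller can check it.
std::size_t functionWithHeapBuffer()
{
    char b1[1 * 1024];                 // small buffer: fine on the stack
    std::vector<char> b3(512 * 1024);  // large buffer: storage comes from the heap
    doSomething(b1, b3.data());        // data() exposes the underlying char*
    return b3.size();
}                                      // b3's destructor frees the heap block here
```

    Only the vector object itself (a few pointers) occupies stack space, so the frame size no longer depends on the buffer size.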

  • 2021-01-01 11:32

    As you say, it is compiler dependent, but you could consider using alloca to work around this. The memory still comes from the stack and is still released automatically, but note that it is released when the function returns, not when the enclosing block ends. In exchange, you take control over when and if the stack space is allocated.

    While use of alloca is typically discouraged, it does have its uses in situations such as the above.
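    A minimal sketch of that idea follows, assuming a glibc-style platform where alloca is declared in <alloca.h> (it is not part of standard C++; MSVC, for instance, provides _alloca in <malloc.h> instead). The function name fillScratch and the buffer sizes are hypothetical and chosen only for illustration.

```cpp
#include <alloca.h>   // non-standard header; assumed available (glibc, BSD)
#include <cstddef>
#include <cstring>

// Picks the buffer size at run time, so the stack only grows by the
// amount this particular call needs.
std::size_t fillScratch(bool needLarge)
{
    const std::size_t size = needLarge ? 64 * 1024 : 1024;

    // The stack pointer is moved here, at the point of the call,
    // rather than being fixed in the function prologue.
    char* buf = static_cast<char*>(alloca(size));

    std::memset(buf, 'x', size);

    // Count the bytes written, so the result depends on the buffer.
    std::size_t count = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (buf[i] == 'x') {
            ++count;
        }
    }
    return count;
}   // the alloca'd memory is released when fillScratch returns
```

    Because the memory lives until the function returns, calling alloca inside a loop keeps growing the stack on every iteration, which is one reason it is usually discouraged.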
