Does a function local static variable automatically incur a branch?

礼貌的吻别 2021-02-18 13:05

For example:

int foo()
{
    static int i = 0;
    return i++;
}

The variable i will only be initialized to 0 the fir

2 Answers
  •  佛祖请我去吃肉
    2021-02-18 13:50

    Yes, there is a branch. Each time the function is entered, the code must check if the variable has already been initialized. But as will be explained below, you usually do not have to care about this branch.

    Example

    Check out this code:

    #include <iostream>
    
    struct Foo { Foo(){ std::cout << "FOO" << std::endl;} };
    void foo(){ static Foo foo; }
    int main(){ foo();}
    

    Now, here is the first part of the assembly code that GCC 4.8 generates for the foo function:

    _Z3foov:
    .LFB974:
    .cfi_startproc
    .cfi_personality 0x3,__gxx_personality_v0
    .cfi_lsda 0x3,.LLSDA974
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    pushq   %r12
    pushq   %rbx
    .cfi_offset 12, -24
    .cfi_offset 3, -32
    movl    $_ZGVZ3foovE3foo, %eax
    movzbl  (%rax), %eax
    testb   %al, %al
    jne .L7                     <------------------- FIRST CHECK
    movl    $_ZGVZ3foovE3foo, %edi
    call    __cxa_guard_acquire <------------------- LOCK    
    testl   %eax, %eax
    setne   %al
    testb   %al, %al
    je  .L7                     <------------------- SECOND CHECK
    movl    $0, %r12d
    movl    $_ZZ3foovE3foo, %edi
    

    As you can see, there is a jne! Then a guard is acquired using __cxa_guard_acquire, followed by a je. Thus, it seems that the compiler is generating the famous double-checked locking pattern here.
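To make the pattern in the assembly easier to follow, here is a hand-written sketch of double-checked locking. It is not what GCC actually emits: the names `initialized`, `guard_mutex`, `init_count`, `value` and `get_value` are hypothetical, the mutex only stands in for `__cxa_guard_acquire`, and the assignment stands in for the constructor call.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical names throughout; this mirrors the pattern, not GCC's runtime.
namespace sketch {

std::atomic<bool> initialized{false};  // plays the role of the guard variable
std::mutex guard_mutex;                // stands in for __cxa_guard_acquire/release
int init_count = 0;                    // how many times the "constructor" ran
int value = 0;                         // storage for the lazily initialized object

int get_value() {
    // First check: lock-free fast path, taken on every call after initialization.
    if (!initialized.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> lock(guard_mutex);
        // Second check: another thread may have initialized while we waited.
        if (!initialized.load(std::memory_order_relaxed)) {
            value = 42;  // stand-in for the constructor call
            ++init_count;
            initialized.store(true, std::memory_order_release);
        }
    }
    return value;
}

// Small self-check: repeated calls initialize only once.
void check_single_init() {
    get_value();
    get_value();
    assert(init_count == 1);
}

}  // namespace sketch
```

After the first call, only the first check is executed, which corresponds to the jne in the listing above.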

    Will every compiler generate a branch?

    I am pretty sure the spec does NOT mandate that a branch or double-checked locking must be used. It only mandates that the initialization be thread-safe. However, I do not see a way to perform a thread-safe initialization without a branch. Thus, even though the spec does not mandate it, it is simply not possible with current CPU architectures to omit the branch here.
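The thread-safety requirement can be observed with a small experiment. The following is only a sketch with hypothetical names (`Tracked`, `touch`, `run_threads`, `constructions`); it assumes a C++11 or later compiler and may need `-pthread` when building with GCC or Clang.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> constructions{0};  // counts constructor calls across all threads

struct Tracked {
    Tracked() { constructions.fetch_add(1); }  // observable side effect
};

void touch() {
    static Tracked instance;  // guarded, thread-safe initialization
    (void)instance;           // silence unused-variable warnings
}

// Starts n threads that all call touch(), then reports how many
// times the constructor ran in total.
int run_threads(int n) {
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back(touch);
    for (auto& t : threads) t.join();
    return constructions.load();
}

// Small self-check: eight racing threads still construct the object once.
void check_once() {
    assert(run_threads(8) == 1);
}
```

However many threads race into `touch()`, the constructor should run exactly once, which is what the guard and its branch are there to ensure.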

    Is the branch expensive?

    As for whether you should care about this branch: you should definitely NOT, since it will be predicted correctly (once the object is initialized, the branch always takes the same route). Thus, the branch is almost free. Trying to avoid a static local variable for optimization purposes should never yield any observable performance benefit.

    Is there really no way around the branch?

    If the constructor is not observable, such as a simple initialization with constant values, then it may be performed eagerly at program startup and the branch is omitted. If, however, it is observable, then things get pretty tricky:

    The only possibility I see is stated in the answer of R. Martinho Fernandes (which has been deleted): the code could modify itself, i.e., simply remove the initialization code once the initialization is done. However, this idea is impractical for the following reasons:

    1. Self-modifying code is very hard to get thread-safe.
    2. Usually, memory flagged executable is write protected so code is not allowed to rewrite itself.
    3. It is just not worth it, as the branch is not expensive (see above).
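The distinction between non-observable and observable initialization made earlier in this answer can be illustrated as follows. This is a sketch with hypothetical names (`next_id`, `make_seed`, `seeded`, `dynamic_inits`); whether a guard is actually emitted depends on the compiler, so treat the comments as typical behavior rather than a guarantee.

```cpp
#include <cassert>

int dynamic_inits = 0;  // hypothetical counter that makes the initializer observable

int next_id() {
    // Constant initializer: the value can be stored directly in the binary's
    // data section, so compilers typically emit no guard and no branch.
    static int counter = 0;
    return counter++;
}

int make_seed() {
    ++dynamic_inits;  // observable side effect
    return 7;
}

int seeded() {
    // The initializer has a side effect, so it must run on the first call
    // and only then; this requires a guarded (branching) initialization.
    static int seed = make_seed();
    return seed;
}

// Small self-check: the side effect happens once, on the first call.
void check_lazy_init() {
    seeded();
    seeded();
    assert(dynamic_inits == 1);
}
```

Comparing the assembly of `next_id` and `seeded` (for example with `g++ -S`) is one way to check whether your compiler emits the guard check in each case.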
