Will C++ linker automatically inline functions (without “inline” keyword, without implementation in header)?

余生分开走 2021-02-07 06:42

Will the C++ linker automatically inline "pass-through" functions, which are NOT defined in the header, and NOT explicitly requested to be "inlined" through the

8 answers
  •  难免孤独
    2021-02-07 07:23

    Here's a quick test of your example (with a MyA::foo() implementation that simply returns 42). All of these tests used 32-bit targets; 64-bit targets may give different results. It's also worth noting that the -flto option (GCC) or the /GL option (MSVC) results in full optimization: wherever MyB::foo() is called, the call is simply replaced with 42.
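
    For reference, the test setup might have looked roughly like the sketch below. The original sources are not shown here, so the file split, the MyA member inside MyB (a_), and the call_through_b() helper are all assumptions made for illustration; only the names MyA::foo() and MyB::foo() and the return value 42 come from the description above. The sketch is a single file so it compiles on its own; in the real test each part would sit in its own translation unit.

```cpp
// Hypothetical reconstruction of the three-file test. The layout and every
// name other than MyA::foo / MyB::foo is an assumption, not the original code.

// --- mya.h / mya.cpp (assumed) ---
class MyA {
public:
    int foo() const;   // declared in the header only
};

// Would live in mya.cpp, invisible to other translation units.
int MyA::foo() const { return 42; }

// --- myb.h / myb.cpp (assumed) ---
class MyB {
public:
    int foo() const;   // declared in the header only
private:
    MyA a_;            // assumed: MyB owns the MyA it forwards to
};

// Pass-through function. When compiled in myb.cpp, the compiler sees only
// the declaration of MyA::foo(), so it cannot inline the body; the most it
// can do is turn the call into a jump.
int MyB::foo() const { return a_.foo(); }

// --- test.cpp (assumed) ---
// Without LTO/LTCG, this call site stays a real call to MyB::foo().
int call_through_b() {
    MyB b;
    return b.foo();
}
```

    Note that with everything in one file, as written here, an optimizing compiler can see all three bodies and may fold call_through_b() on its own; the behavior discussed in this answer only shows up once the definitions are split across separate .cpp files.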

    With GCC (MinGW 4.5.1):

    gcc -g -O3 -o test.exe myb.cpp mya.cpp test.cpp
    

    the call to MyB::foo() was not optimized away. MyB::foo() itself was slightly optimized to:

    Dump of assembler code for function MyB::foo() const:
       0x00401350 <+0>:     push   %ebp
       0x00401351 <+1>:     mov    %esp,%ebp
       0x00401353 <+3>:     sub    $0x8,%esp
    => 0x00401356 <+6>:     leave
       0x00401357 <+7>:     jmp    0x401360 
    

    That is, the entry prologue is left in place but immediately undone (the leave instruction), and the code then jumps to MyA::foo() to do the real work. However, this is an optimization performed by the compiler (not the linker), since it recognizes that MyB::foo() simply returns whatever MyA::foo() returns. I'm not sure why the prologue is left in.

    MSVC 16 (from VS 2010) handled things a little differently:

    MyB::foo() ended up as two jumps - one to a 'thunk' of some sort:

    0:000> u myb!MyB::foo
    myb!MyB::foo:
    001a1030 e9d0ffffff      jmp     myb!ILT+0(?fooMyAQBEHXZ) (001a1005)
    

    And the thunk simply jumped to MyA::foo():

    myb!ILT+0(?fooMyAQBEHXZ):
    001a1005 e936000000      jmp     myb!MyA::foo (001a1040)
    

    Again - this was largely (entirely?) performed by the compiler, since if you look at the object code produced before linking, MyB::foo() is compiled to a plain jump to MyA::foo().

    So to boil all this down: without explicitly invoking LTO/LTCG, the linkers tested here were unwilling or unable to remove the call to MyB::foo() altogether, even though MyB::foo() is a simple jump to MyA::foo().

    So I guess if you want link time optimization, use the -flto (for GCC) or /GL (for the MSVC compiler) and /LTCG (for the MSVC linker) options.
