About the cost of virtual function calls

没有蜡笔的小新 2021-01-11 23:41

If I call a virtual function 1000 times in a loop, will I suffer from the vtable lookup overhead 1000 times or only once?

7 answers
  •  隐瞒了意图╮
    2021-01-12 00:13

    Let's give it a try with g++ targeting x86:

    $ cat y.cpp
    struct A
      {
        virtual void not_used(int);
        virtual void f(int);
      };
    
    void foo(A &a)
      {
        for (unsigned i = 0; i < 1000; ++i)
          a.f(13);
      }
    $ 
    $ gcc -S -O3  y.cpp  # assembler output, max optimization
    $ 
    $ cat y.s
        .file   "y.cpp"
        .section    .text.unlikely,"ax",@progbits
    .LCOLDB0:
        .text
    .LHOTB0:
        .p2align 4,,15
        .globl  _Z3fooR1A
        .type   _Z3fooR1A, @function
    _Z3fooR1A:
    .LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        pushq   %rbx
        .cfi_def_cfa_offset 24
        .cfi_offset 3, -24
        movq    %rdi, %rbp
        movl    $1000, %ebx
        subq    $8, %rsp
        .cfi_def_cfa_offset 32
        .p2align 4,,10
        .p2align 3
    .L2:
        movq    0(%rbp), %rax
        movl    $13, %esi
        movq    %rbp, %rdi
        call    *8(%rax)
        subl    $1, %ebx
        jne .L2
        addq    $8, %rsp
        .cfi_def_cfa_offset 24
        popq    %rbx
        .cfi_def_cfa_offset 16
        popq    %rbp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
    .LFE0:
        .size   _Z3fooR1A, .-_Z3fooR1A
        .section    .text.unlikely
    .LCOLDE0:
        .text
    .LHOTE0:
        .ident  "GCC: (GNU) 5.3.1 20160406 (Red Hat 5.3.1-6)"
        .section    .note.GNU-stack,"",@progbits
    $
    

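    The .L2 label is the top of the loop. The instruction right after it, movq 0(%rbp), %rax, loads the object's vpointer into rax. The call three instructions later, call *8(%rax), is indirect: it fetches the pointer to the f() override from the second slot of the vtable (the first slot belongs to not_used()). Both the vpointer load and the vtable fetch sit inside the loop, so they are repeated on each of the 1000 iterations.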
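To make those two instructions concrete, here is a minimal hand-written model of what the loop does. It is only a sketch and assumes the usual x86-64 layout: the vpointer is the first 8 bytes of the object and each virtual function takes one 8-byte table entry. Object, Slot, table_for_A, not_used_impl, f_impl and count_calls are made-up names for illustration; the real vtable is generated by the compiler.

```cpp
#include <cassert>

// Hand-rolled stand-in for the compiler-generated layout (all names hypothetical).
struct Object;
using Slot = void (*)(Object *self, int arg);

struct Object {
    const Slot *vptr;  // plays the role of the hidden vpointer at offset 0
    int calls;         // counts how often f_impl ran
};

void not_used_impl(Object *, int) {}
void f_impl(Object *self, int) { ++self->calls; }

// Slot 0 = not_used, slot 1 = f. With 8-byte pointers, slot 1 is at byte offset 8.
const Slot table_for_A[] = { not_used_impl, f_impl };

void foo(Object &a) {
    for (unsigned i = 0; i < 1000; ++i) {
        const Slot *vtable = a.vptr;  // movq 0(%rbp), %rax  (reloaded on every pass)
        vtable[1](&a, 13);            // call *8(%rax)
    }
}

// Runs the loop once and reports how many indirect calls were made.
int count_calls() {
    Object a{table_for_A, 0};
    foo(a);
    assert(a.vptr == table_for_A);  // nothing in this model rewrites the vpointer
    return a.calls;
}
```

The hoisted version one might have expected would read a.vptr and vtable[1] once before the loop; the generated code above does not do that.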

    I'm surprised by this. I would have expected the compiler to treat the address of the f() override as a loop invariant and load it once before the loop. It seems that gcc, which sees only the declaration of f() here, is guarding against two "paranoid" possibilities:

    1. The f() override may somehow change the hidden vpointer in the object.
    2. The f() override may somehow change the contents of the vtable.

    Edit: In a separate compilation unit, I implemented A::f() and a main function that calls foo(). I then built an executable with gcc using link-time optimization (LTO) and ran objdump on it. The virtual function call had been inlined. So perhaps the per-iteration lookup is there because, without LTO, gcc cannot see the body of f() while compiling foo() and has to be conservative.
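The answer does not show the second compilation unit, so the following is only a guess at its shape. The body of A::f(), the global total and the helper run_foo() are assumptions made for illustration. Both "files" are shown in one listing so that it compiles on its own; in a real two-file build, the second file would repeat the definition of struct A and declare foo().

```cpp
#include <cassert>

// ---- y.cpp, as in the answer ----
struct A {
    virtual void not_used(int);
    virtual void f(int);
};

void foo(A &a) {
    for (unsigned i = 0; i < 1000; ++i)
        a.f(13);
}

// ---- second translation unit (hypothetical contents) ----
long total = 0;  // visible side effect so the calls cannot simply be dropped

void A::not_used(int) {}
void A::f(int x) { total += x; }

// Stand-in for the body of main(): calls foo() on a plain A and returns the sum.
long run_foo() {
    total = 0;
    A a;
    foo(a);
    assert(total % 13 == 0);  // every call added exactly 13
    return total;
}
```

With the two parts in separate files and run_foo() called from main, a build along the lines of g++ -O3 -flto y.cpp main.cpp followed by objdump -d on the result lets you check whether the indirect call is still there. Whether it is devirtualized and inlined may depend on the compiler version.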
