Why isn't this unused variable optimised away?

情深已故 2021-02-06 21:47

I played around with Godbolt's Compiler Explorer to see how good certain optimizations are. My minimal working example is:

#include 

int         


        
4 Answers
  • 2021-02-06 22:16

    std::vector<T> is a fairly complicated class that involves dynamic allocation. While clang++ is sometimes able to elide heap allocations, it is a fairly tricky optimization and you should not rely on it. An example where clang++ does elide the allocation:

    int foo() {
        int* p = new int{5};
        return *p;
    }
    
    foo():                                # @foo()
            mov     eax, 5
            ret
    

    By contrast, using std::array<T, N> (which does not dynamically allocate) produces fully-inlined code:

    #include <array>
    
    int foo() {
        std::array v{1, 2, 3, 4, 5};
        return v[4];
    }
    
    foo():                                # @foo()
            mov     eax, 5
            ret
    

    As Marc Glisse noted in the other answer's comments, this is what the Standard says in [expr.new] #10:

    An implementation is allowed to omit a call to a replaceable global allocation function ([new.delete.single], [new.delete.array]). When it does so, the storage is instead provided by the implementation or provided by extending the allocation of another new-expression. The implementation may extend the allocation of a new-expression e1 to provide storage for a new-expression e2 if the following would be true were the allocation not extended: [...]

  • 2021-02-06 22:26

    N3664's change to [expr.new], cited in one answer and one comment, permits new-expressions to not call a replaceable global allocation function. But vector allocates memory using std::allocator<T>::allocate, which calls ::operator new directly, not via a new-expression. So that special permission doesn't apply, and generally compilers cannot elide such direct calls to ::operator new.
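    To make the distinction concrete, here is a minimal sketch contrasting the two forms. The function names are made up for illustration. The first function uses a new-expression, which falls under the [expr.new] permission; the second calls ::operator new directly and then constructs the object with placement new, which is roughly the pattern a container follows through its allocator. Whether a particular compiler actually elides the first allocation depends on the compiler and its flags.

```cpp
#include <new>

// Allocation through a new-expression: [expr.new] permits the
// implementation to omit the call to the allocation function.
int via_new_expression() {
    int* p = new int{5};
    int result = *p;
    delete p;
    return result;
}

// Allocation through a direct call to ::operator new. This is an
// ordinary function call, so the [expr.new] permission does not cover it.
int via_direct_call() {
    void* raw = ::operator new(sizeof(int));
    int* p = ::new (raw) int{5};   // placement new: constructs, does not allocate
    int result = *p;
    ::operator delete(raw);
    return result;
}
```

    Both functions return the same value; the difference is only in what the optimizer is allowed to assume about the allocation.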

    All hope is not lost, however, for std::allocator<T>::allocate's specification has this to say:

    Remarks: the storage is obtained by calling ::operator new, but it is unspecified when or how often this function is called.

    Leveraging this permission, libc++'s std::allocator uses special clang built-ins to indicate to the compiler that elision is permitted. With -stdlib=libc++, clang compiles your code down to

    foo():                                # @foo()
            mov     eax, 5
            ret
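    As a rough sketch of the idea, and not libc++'s actual source: Clang documents the built-ins __builtin_operator_new and __builtin_operator_delete, which behave like calls to ::operator new and ::operator delete but may be optimized the way a new-expression can. The helper names sketch_allocate, sketch_deallocate and foo_with_sketch are hypothetical, and the fallback branch is an assumption added so the code also builds with compilers that lack the built-ins.

```cpp
#include <cstddef>
#include <new>

// Older or non-Clang compilers may not provide __has_builtin.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical allocator helper: use the Clang built-in when available,
// otherwise fall back to a plain (non-elidable) call.
inline void* sketch_allocate(std::size_t bytes) {
#if __has_builtin(__builtin_operator_new)
    return __builtin_operator_new(bytes);
#else
    return ::operator new(bytes);
#endif
}

inline void sketch_deallocate(void* p) noexcept {
#if __has_builtin(__builtin_operator_delete)
    __builtin_operator_delete(p);
#else
    ::operator delete(p);
#endif
}

// Allocate, construct, read and release a single int through the helpers.
int foo_with_sketch() {
    void* raw = sketch_allocate(sizeof(int));
    int* p = ::new (raw) int{5};
    int result = *p;
    sketch_deallocate(raw);
    return result;
}
```

    With Clang and optimizations enabled, the allocation in foo_with_sketch is a candidate for elision; with the fallback branch it is an ordinary call.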
    
  • 2021-02-06 22:30

    Compilers generally do not optimise away heap-related code, because heap allocation happens at run time. Here the heap-related code is the std::vector, which keeps the data it manages in heap memory.

    In your example all values, as well as the size, are known at compile time, so it's possible to use std::array instead. It is initialized by aggregate initialization and may therefore be constexpr-qualified.

    Changing your example using std::array reduces your function to your expected output:

    #include <array>
    
    int foo() {
        std::array<int,5> v {1, 2, 3, 4, 5};
        return v[4];
    }
    
    foo():                                # @foo()
            mov     eax, 5
            ret
    

    A caller of this function may still contain a call to foo(), for example when the call is not inlined. Qualifying the function as constexpr allows it to be evaluated at compile time:

    #include <array>
    
    constexpr int foo() {
        constexpr std::array<int,5> v {1, 2, 3, 4, 5};
        return v[4];
    }
    
    int main() {
        return foo();
    }
    
    main:                                   # @main
            mov     eax, 5
            ret
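    If you want the compiler to confirm that the value really is computed at compile time, one option is a static_assert or a constexpr variable. This is a small sketch; the names foo_checked and kLast are made up for illustration.

```cpp
#include <array>

constexpr int foo_checked() {
    constexpr std::array<int, 5> v{1, 2, 3, 4, 5};
    return v[4];
}

// Compiles only if foo_checked() can be evaluated as a constant expression.
static_assert(foo_checked() == 5, "foo_checked() must be a constant expression");

// Initializing a constexpr variable also forces compile-time evaluation.
constexpr int kLast = foo_checked();
```

    A plain call such as return foo_checked(); in main is not required to be evaluated at compile time, which is why the two forms above are useful as a check.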
    
  • 2021-02-06 22:34

    As the comments note, operator new can be replaced, and the replacement can live in any translation unit. Optimizing a program for the case where it is not replaced therefore requires whole-program analysis. And if it is replaced, the replacement has to be called, of course.
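    For illustration, this is roughly what such a replacement looks like. It is a minimal sketch: the counter g_new_calls and the helper calls_for_direct_allocation are hypothetical names, and the replacement simply forwards to std::malloc and std::free.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter, used only to make calls to the replacement visible.
static std::size_t g_new_calls = 0;

// Replacement for the global allocation function. Defining it in any one
// translation unit replaces the default for the whole program.
void* operator new(std::size_t size) {
    ++g_new_calls;
    if (void* p = std::malloc(size == 0 ? 1 : size)) {
        return p;
    }
    throw std::bad_alloc{};
}

// Matching replacements for the deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Counts how often the replacement runs for one direct allocation.
std::size_t calls_for_direct_allocation() {
    const std::size_t before = g_new_calls;
    void* p = ::operator new(16);   // direct call, not a new-expression
    ::operator delete(p);
    return g_new_calls - before;
}
```

    A translation unit that only sees a call to operator new cannot tell whether a definition like this one exists elsewhere in the program.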

    Whether the default operator new is a library I/O call is unspecified. That matters, because library I/O calls are observable and therefore they can't be optimized out either.
