I played around with Godbolt's Compiler Explorer to see how good certain optimizations are. My minimal working example is:
#include
int
std::vector<T> is a fairly complicated class that involves dynamic allocation. While clang++ is sometimes able to elide heap allocations, it is a fairly tricky optimization and you should not rely on it. Example:
int foo() {
int* p = new int{5};
return *p;
}
foo(): # @foo()
mov eax, 5
ret
By contrast, using std::array<T, N> (which does not dynamically allocate) produces fully-inlined code:
#include <array>
int foo() {
std::array v{1, 2, 3, 4, 5};
return v[4];
}
foo(): # @foo()
mov eax, 5
ret
As Marc Glisse noted in the other answer's comments, this is what the Standard says in [expr.new] #10:
An implementation is allowed to omit a call to a replaceable global allocation function ([new.delete.single], [new.delete.array]). When it does so, the storage is instead provided by the implementation or provided by extending the allocation of another new-expression. The implementation may extend the allocation of a new-expression e1 to provide storage for a new-expression e2 if the following would be true were the allocation not extended: [...]
N3664's change to [expr.new], cited in one answer and one comment, permits new-expressions to not call a replaceable global allocation function. But vector allocates memory using std::allocator<T>::allocate, which calls ::operator new directly, not via a new-expression. So that special permission doesn't apply, and generally compilers cannot elide such direct calls to ::operator new.
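The distinction between a new-expression and a direct call can be sketched as follows. The function names via_new_expression and via_direct_call are hypothetical and exist only for this illustration. Whether a given compiler actually removes the allocation in the first function depends on the compiler and optimization level.

```cpp
#include <new>

// New-expression: [expr.new] allows the implementation to omit the call
// to the replaceable global allocation function.
int via_new_expression() {
    int* p = new int{5};
    int value = *p;
    delete p;
    return value;
}

// Direct call to ::operator new followed by placement new: this is an
// ordinary function call, so the [expr.new] permission does not cover it.
int via_direct_call() {
    void* raw = ::operator new(sizeof(int));
    int* p = ::new (raw) int{5};
    int value = *p;
    ::operator delete(raw);
    return value;
}
```

Both functions return 5; they differ only in how much freedom the optimizer has to drop the allocation.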
All hope is not lost, however, for std::allocator<T>::allocate's specification has this to say:
Remarks: the storage is obtained by calling ::operator new, but it is unspecified when or how often this function is called.
Leveraging this permission, libc++'s std::allocator uses special clang built-ins to indicate to the compiler that elision is permitted. With -stdlib=libc++, clang compiles your code down to:
foo(): # @foo()
mov eax, 5
ret
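As a rough sketch of the idea, and not libc++'s actual source: clang documents __builtin_operator_new and __builtin_operator_delete, which take the same arguments as ::operator new and ::operator delete but permit the optimizations allowed for new-expressions. The helper names allocate_bytes, deallocate_bytes and foo_builtin are hypothetical, and the fallback branch is an assumption added so the code also builds on compilers without these built-ins.

```cpp
#include <cstddef>
#include <new>

// Fallback so the feature check below also compiles where
// __has_builtin itself is unavailable.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper: allocate through the built-in when available,
// which tells the compiler the call may be elided.
inline void* allocate_bytes(std::size_t n) {
#if __has_builtin(__builtin_operator_new)
    return __builtin_operator_new(n);
#else
    return ::operator new(n);
#endif
}

// Hypothetical helper: matching deallocation.
inline void deallocate_bytes(void* p) noexcept {
#if __has_builtin(__builtin_operator_delete)
    __builtin_operator_delete(p);
#else
    ::operator delete(p);
#endif
}

int foo_builtin() {
    void* raw = allocate_bytes(sizeof(int));
    int* p = ::new (raw) int{5};
    int value = *p;
    deallocate_bytes(raw);
    return value;
}
```

Whether the allocation in foo_builtin is actually removed still depends on the compiler version and flags.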
The compiler can't optimise heap-related code, as heap-related code is run-time specific. The heap-related code is the use of a std::vector, which holds the managed data in heap memory.
In your example all values as well as the size are known at compile time, so it's possible to use std::array instead, which is initialized by aggregate initialization and therefore may be constexpr-qualified.
Changing your example to use std::array reduces your function to your expected output:
#include <array>
int foo() {
std::array<int,5> v {1, 2, 3, 4, 5};
return v[4];
}
foo(): # @foo()
mov eax, 5
ret
Calling the given function from elsewhere can still result in a call to foo(). To let the compiler evaluate it at compile time and eliminate the call, qualify the function as constexpr:
#include <array>
constexpr int foo() {
constexpr std::array<int,5> v {1, 2, 3, 4, 5};
return v[4];
}
int main() {
return foo();
}
main: # @main
mov eax, 5
ret
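One way to check that such a function really is usable at compile time is a static_assert, which only compiles if its operand is a constant expression. A minimal sketch, assuming C++14 or later; the names last_element and use_last_element are hypothetical:

```cpp
#include <array>

constexpr int last_element() {
    constexpr std::array<int, 5> v{1, 2, 3, 4, 5};
    return v[4];
}

// Compiles only if last_element() can be evaluated at compile time.
static_assert(last_element() == 5, "last_element() is a constant expression");

int use_last_element() {
    // Initializing a constexpr variable forces compile-time evaluation here.
    constexpr int value = last_element();
    return value;
}
```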
As the comments note, operator new can be replaced. This can happen in any translation unit. Optimizing a program for the case where it's not replaced therefore requires whole-program analysis. And if it is replaced, you have to call it, of course.
Whether the default operator new is a library I/O call is unspecified. That matters, because library I/O calls are observable and therefore can't be optimized out either.
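To make the replacement point concrete, here is a minimal sketch of a program-wide replacement for the global allocation function. The counter g_allocation_count and the accessor allocation_count are hypothetical names for this illustration, and the sketch is simplified: it does not call the installed new-handler before throwing.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter, used only for this illustration.
static std::atomic<std::size_t> g_allocation_count{0};

// Replacement for the global allocation function. Defining it in any one
// translation unit replaces the default for the whole program.
void* operator new(std::size_t size) {
    g_allocation_count.fetch_add(1, std::memory_order_relaxed);
    if (size == 0) size = 1;  // malloc(0) may return a null pointer
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc{};
}

// Matching replacements for the deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Number of times the replaced operator new has been entered so far.
std::size_t allocation_count() {
    return g_allocation_count.load(std::memory_order_relaxed);
}
```

A compiler that sees only the translation unit containing foo() cannot know whether a definition like this exists elsewhere in the program, which is why a direct call to ::operator new generally has to be kept.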