"I have an application that is performing some processing on some images. Given that I know the width/height/format etc. (I do), and thinking just about defining a buffer..."
I would avoid std::vector as a container for an unstructured buffer: used this way, it can be dramatically slower than a plain heap allocation, especially in builds without full optimization.
Consider this C++14 example (in C++11, which lacks std::make_unique, you can use std::shared_ptr with std::make_shared instead, though the array case then shows a slight performance hit at -O2 and -O3 that the vector case does not):
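#include <array>
#include <chrono>
#include <iostream>
#include <memory>
#include <vector>

namespace {
// Heap-allocates a fixed-size array of 4,000,000 value-initialized (zeroed) bytes.
std::unique_ptr<std::array<unsigned char, 4000000>> allocateWithPtr() {
  return std::make_unique<std::array<unsigned char, 4000000>>();
}
// Allocates a vector of 4,000,000 value-initialized (zeroed) bytes.
std::vector<unsigned char> allocateWithVector() {
  return std::vector<unsigned char>(4000000);
}
} // namespace

int main() {
  // Convert explicitly to milliseconds; a clock's native tick period varies by platform.
  using ms = std::chrono::duration<double, std::milli>;
  auto start = std::chrono::steady_clock::now();
  for (long i = 0; i < 1000; i++) {
    auto myBuff = allocateWithPtr();
  }
  auto ptr_end = std::chrono::steady_clock::now();
  for (long i = 0; i < 1000; i++) {
    auto myBuff = allocateWithVector();
  }
  auto vector_end = std::chrono::steady_clock::now();
  std::cout << "std::unique_ptr = " << ms(ptr_end - start).count()
            << " ms." << std::endl;
  std::cout << "std::vector = " << ms(vector_end - ptr_end).count()
            << " ms." << std::endl;
}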
Output:
bash % clang++ -O3 -std=gnu++14 test.cpp && ./a.out
std::unique_ptr = 0 ms.
std::vector = 0 ms.
bash % clang++ -O2 -std=gnu++14 test.cpp && ./a.out
std::unique_ptr = 0 ms.
std::vector = 0 ms.
bash % clang++ -O1 -std=gnu++14 test.cpp && ./a.out
std::unique_ptr = 89.945 ms.
std::vector = 14135.3 ms.
bash % clang++ -O0 -std=gnu++14 test.cpp && ./a.out
std::unique_ptr = 80.945 ms.
std::vector = 67521.1 ms.
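Even with no writes or reallocations, std::vector is over 800 times slower than a heap array owned by a unique_ptr at -O0, and about 150 times slower at -O1. (At -O2 and -O3 both loops report 0 ms, most likely because the compiler optimizes the unused allocations away.) What's going on here?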
As @MartinSchlott points out, std::vector is not designed for this task. A vector is for holding a set of object instances, not an unstructured (from an array standpoint) buffer. Objects have constructors and destructors. Constructing a vector of a given size initializes every element, and when the vector is destroyed, it calls the destructor for each element in it. Unless the optimizer removes that work, even a std::vector<unsigned char> goes through a destroy step for each char in the vector.
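The per-element construction and destruction described above can be made visible with a small instrumented type. This is only an illustrative sketch: Counted and countVectorLifecycle are hypothetical names that are not part of the benchmark, and for a trivial type such as unsigned char an optimizing compiler is free to remove this work.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is constructed and destroyed.
struct Counted {
  static int constructed;
  static int destroyed;
  Counted() { ++constructed; }
  ~Counted() { ++destroyed; }
};
int Counted::constructed = 0;
int Counted::destroyed = 0;

// Creates and destroys a vector of n elements, then returns
// {constructor calls, destructor calls} observed during that lifetime.
std::pair<int, int> countVectorLifecycle(std::size_t n) {
  Counted::constructed = 0;
  Counted::destroyed = 0;
  {
    std::vector<Counted> v(n);  // default-constructs n elements
  }                             // all n elements are destroyed here
  return {Counted::constructed, Counted::destroyed};
}
```

Calling countVectorLifecycle(5) reports five constructor calls and five destructor calls, one pair per element.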
You can see how much time it takes just to "destroy" the unsigned chars in this vector with this example:
#include <chrono>
#include <iostream>
#include <vector>

namespace {
std::vector<unsigned char> allocateWithVector() {
  return std::vector<unsigned char>(4000000);
}
} // namespace

int main() {
  // Convert explicitly to milliseconds; a clock's native tick period varies by platform.
  using ms = std::chrono::duration<double, std::milli>;
  auto start = std::chrono::steady_clock::now();
  for (long i = 0; i < 100; i++) {
    // Deliberately leaked so that the vector's destructor never runs.
    auto leakThis = new std::vector<unsigned char>(allocateWithVector());
  }
  auto leak_end = std::chrono::steady_clock::now();
  for (long i = 0; i < 100; i++) {
    // Destroyed at the end of each iteration.
    auto myBuff = allocateWithVector();
  }
  auto vector_end = std::chrono::steady_clock::now();
  std::cout << "leaking vectors: = "
            << ms(leak_end - start).count() << " ms." << std::endl;
  std::cout << "destroying vectors = "
            << ms(vector_end - leak_end).count() << " ms." << std::endl;
}
Output:
leaking vectors: = 2058.2 ms.
destroying vectors = 3473.72 ms.
real 0m5.579s
user 0m5.427s
sys 0m0.135s
Even with the destruction of the vectors removed, it still takes about 2 seconds just to construct 100 of them.
If you don't need dynamic resizing, or construction & destruction of the elements making up your buffer, don't use std::vector.
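For the image case in the question, where the width, height, and format are known up front, one option is a fixed-size heap allocation owned by a std::unique_ptr. The following is a minimal sketch under that assumption. ImageBuffer, its fields, and makeImageBuffer are hypothetical names, and the interleaved channel layout is an assumption rather than something the question specifies.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical fixed-size image buffer; its size is set once and never changes.
struct ImageBuffer {
  std::size_t width;
  std::size_t height;
  std::size_t channels;                   // e.g. 3 for RGB, 4 for RGBA (assumed interleaved)
  std::unique_ptr<unsigned char[]> data;  // released automatically with the ImageBuffer

  std::size_t size() const { return width * height * channels; }

  // Returns a pointer to the first channel of pixel (x, y); no bounds checking.
  unsigned char* pixel(std::size_t x, std::size_t y) {
    return data.get() + (y * width + x) * channels;
  }
};

// Allocates the buffer without initializing its contents: plain new[]
// default-initializes unsigned char, which leaves the bytes indeterminate.
// Use std::make_unique<unsigned char[]>(n) instead if zeroed memory is wanted.
ImageBuffer makeImageBuffer(std::size_t width, std::size_t height,
                            std::size_t channels) {
  const std::size_t n = width * height * channels;
  return ImageBuffer{width, height, channels,
                     std::unique_ptr<unsigned char[]>(new unsigned char[n])};
}
```

For instance, makeImageBuffer(1000, 1000, 4) allocates 4,000,000 bytes, the same size as the buffers in the benchmarks above, and the caller must write each byte before reading it.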