I have been using std::vector a lot, and recently I asked myself this question: "How is std::vector implemented?"
I had two alternatives:
It's implemented by using an underlying array.
It's not possible to implement a std::vector<T> with a linked list because the standard guarantees that the vector's elements are held in contiguous memory.
I believe it is the third option. It can't just use new T[n], because then it would actually have to construct as many objects as it allocates. For example:
std::vector<Foo> v;
v.reserve(10);
If your implementation simply ended up doing new Foo[10], then you'd just have constructed 10 instances of Foo.
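To make the difference concrete, here is a small sketch. Foo below is a made-up type with a static counter (my own assumption, not something from the question), and the two helper functions are hypothetical names. It just counts how many default constructions happen in each case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type that counts how often it is default-constructed.
struct Foo {
    static int constructed;
    Foo() { ++constructed; }
};
int Foo::constructed = 0;

// Number of Foo objects default-constructed by new Foo[n].
int constructed_by_array_new(std::size_t n) {
    Foo::constructed = 0;
    Foo* p = new Foo[n];          // allocates AND default-constructs n objects
    int count = Foo::constructed;
    delete[] p;
    return count;
}

// Number of Foo objects default-constructed by reserve(n) on an empty vector.
int constructed_by_reserve(std::size_t n) {
    Foo::constructed = 0;
    std::vector<Foo> v;
    v.reserve(n);                 // only obtains storage, builds no elements
    assert(v.size() == 0 && v.capacity() >= n);
    return Foo::constructed;
}
```

Calling constructed_by_array_new(10) should report 10 constructions, while constructed_by_reserve(10) should report none.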
Instead, it uses its allocator to allocate and deallocate raw memory (without constructing objects). As needed (for example, when you actually push_back objects), it places copy-constructed instances into the correct memory locations in its reserved storage using placement new, and removes them with explicit calls to the destructor (something you'd only do in combination with placement new). The allocator class provides the following methods for that, which I presume vector implementations use:
void construct(pointer p, const_reference val);
Returns:
new((void *)p) T(val)
void destroy(pointer p);
Returns:
((T*)p)->~T()
(The "returns" probably should read "effect" or similar.)
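Here is a minimal sketch of that allocate / construct / destroy / deallocate cycle done by hand. Tracked, construct_at_slot, destroy_at_slot and raw_storage_demo are names I made up for illustration; the two helpers only mirror the effects quoted above and are not the library's own code. (As far as I know, construct and destroy were deprecated as std::allocator members in C++17 and removed in C++20, with std::allocator_traits taking over, so the sketch uses placement new directly.)

```cpp
#include <cassert>
#include <memory>
#include <new>

// Hypothetical element type that tracks how many instances are alive.
struct Tracked {
    static int live;
    int value;
    explicit Tracked(int v) : value(v) { ++live; }
    Tracked(const Tracked& o) : value(o.value) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Same effect as the quoted construct(): copy-construct val at address p.
template <class T>
void construct_at_slot(T* p, const T& val) {
    ::new (static_cast<void*>(p)) T(val);
}

// Same effect as the quoted destroy(): run the destructor, keep the memory.
template <class T>
void destroy_at_slot(T* p) {
    p->~T();
}

int raw_storage_demo() {
    std::allocator<Tracked> alloc;
    Tracked* storage = alloc.allocate(3);     // raw memory for 3 objects
    assert(Tracked::live == 0);               // nothing constructed yet

    construct_at_slot(storage, Tracked(42));  // copy-construct into slot 0
    assert(Tracked::live == 1);               // the temporary is already gone
    int seen = storage[0].value;

    destroy_at_slot(storage);                 // explicit destructor call
    assert(Tracked::live == 0);
    alloc.deallocate(storage, 3);             // hand the raw memory back
    return seen;
}
```

Note that slots 1 and 2 are never constructed, which is exactly what new Tracked[3] could not do.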
More about placement new
A pedagogic (and thus simplified) version of a container called "Vec" is discussed in Chapter 11 of the wonderful (introductory) book "Accelerated C++". What they describe is a stripped-down version of std::vector, but I think it is still worth noting that:
1) they implement their template class in terms of an array,
2) they discuss push_back in terms of the trick (mentioned above) of allocating more storage than is needed, and coming back for more when they run out, and
3) they use allocator<T> for memory management. The new operator is not flexible enough in this context, since it both allocates and initializes memory.
I repeat, though, that this doesn't mean that actual implementations out there are this simple. But since "Accelerated C++" is quite widespread, those interested can find in the relevant chapter one way vector-like objects can be created, copied, assigned, and destroyed.
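For those without the book at hand, here is a rough sketch in the same spirit. To be clear, this is my own simplification and not the book's code or any real library's implementation: the member names, the doubling policy and the fill_with_numbers helper are all assumptions, copying is disabled, and there is no exception safety.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

template <class T>
class Vec {
public:
    Vec() : data_(nullptr), size_(0), capacity_(0) {}
    Vec(const Vec&) = delete;             // copying left out for brevity
    Vec& operator=(const Vec&) = delete;
    ~Vec() { release(); }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
    T& operator[](std::size_t i) { assert(i < size_); return data_[i]; }

    void push_back(const T& val) {
        if (size_ == capacity_) {
            // Out of room: ask for twice as much raw storage (1 if empty).
            std::size_t new_cap = capacity_ ? 2 * capacity_ : 1;
            T* new_data = alloc_.allocate(new_cap);
            // Build the new element first: val may refer to an old element.
            ::new (static_cast<void*>(new_data + size_)) T(val);
            for (std::size_t i = 0; i < size_; ++i)
                ::new (static_cast<void*>(new_data + i)) T(data_[i]);
            release();                    // destroy old copies, free old block
            data_ = new_data;
            capacity_ = new_cap;
        } else {
            // Spare raw storage exists: construct directly into the next slot.
            ::new (static_cast<void*>(data_ + size_)) T(val);
        }
        ++size_;
    }

private:
    // Destroy the constructed elements, then free the raw block (if any).
    void release() {
        for (std::size_t i = size_; i > 0; --i) data_[i - 1].~T();
        if (data_) alloc_.deallocate(data_, capacity_);
    }

    std::allocator<T> alloc_;
    T* data_;
    std::size_t size_;
    std::size_t capacity_;
};

// Usage sketch: append "0", "1", ..., up to n strings.
void fill_with_numbers(Vec<std::string>& v, int n) {
    for (int i = 0; i < n; ++i) v.push_back(std::to_string(i));
}
```

With this doubling policy, eight push_back calls leave the capacity at 8, and a ninth one forces a move to a block of 16.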
EDIT: On a related note, I just found the following blog post by Herb Sutter in which he comments on an earlier blog post by Andrew Koenig, regarding whether or not one should be worried about vector elements being contiguous in memory: Cringe not: Vectors are guaranteed to be contiguous.
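In practice, the contiguity guarantee means you can hand a vector's storage to code that expects a plain array. A small sketch, where is_contiguous and sum_c_array are hypothetical helpers of mine (and the check does not apply to std::vector<bool>, which is a special case):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True if element i lives exactly i slots past the start of the buffer.
// Not usable with std::vector<bool>, whose operator[] returns a proxy.
template <class T>
bool is_contiguous(const std::vector<T>& v) {
    for (std::size_t i = 0; i < v.size(); ++i)
        if (&v[i] != v.data() + i) return false;
    return true;
}

// A C-style function that knows nothing about std::vector.
int sum_c_array(const int* a, std::size_t n) {
    assert(a != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += a[i];
    return total;
}
```

For a std::vector<int> v, calling sum_c_array(v.data(), v.size()) works just as it would on an int array.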
There's no actual array at all in any decent implementation (if there is, you can't use any object in it without a default constructor), but just raw memory that gets allocated. It gets allocated in a manner that's usually along the lines of doubling every time you need to expand it.
The vector then uses in-place construction (placement new) to call the constructors of the class in the proper location once each slot actually gets used.
When it needs to expand, it may try to reallocate in place (but this is a bit silly and doesn't normally work; think Windows 98 heap compaction), but it will usually end up making a whole new allocation and copying the elements over.
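You can watch the growth happen from the outside. The sketch below (count_reallocations is a name I made up) pushes n ints one at a time and counts how often capacity() changes. The exact growth factor is up to the implementation, so I'm not quoting any expected numbers, only that the count should be far smaller than n.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Push n ints one by one and count how many times the capacity changed,
// i.e. how many times the vector had to get a bigger block.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::size_t cap_before = v.capacity();
        v.push_back(static_cast<int>(i));
        if (v.capacity() != cap_before) ++reallocations;
        assert(v.size() <= v.capacity());
    }
    return reallocations;
}
```

Printing v.capacity() inside the loop is a quick way to see which growth policy your own standard library uses.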
A standard STL vector always keeps its elements together in one contiguous block, but not all vector-like implementations work like that (I know, having written some of them). Probably none are exactly a linked list, though, either.
From what I have read in books, and from the functionality of reserve and the requirement that the elements of a vector be contiguous, this is what I think could be a possible way to implement vector:
1) The elements of a vector must be contiguous, supporting O(1) random access, and vectors should be compatible with C arrays. This implies that there are no linked lists.
2) When you call reserve, it reserves additional memory. But reserve does not call new T[newSize] to reserve more memory; otherwise it would call the default constructor. As uncleben explained, whenever reserve is called the vector class just allocates more uninitialized memory using its allocator (if required) and copy-constructs the existing elements into that memory using placement new (if more memory has been allocated).
3) Initially the vector may have some default capacity, for which uninitialized memory is allocated when the vector object is constructed.
4) push_back copy-constructs the object into the first available location. If required, more memory is allocated in the same manner as in reserve.
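One observable consequence of points 2) and 4): once reserve(n) has been called, pushing up to n elements must not reallocate, so pointers to existing elements stay valid. A small sketch to check this (stays_put_after_reserve is a hypothetical helper name):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the first element never moved and the capacity never
// changed while filling the vector up to the n slots requested by reserve.
bool stays_put_after_reserve(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);                  // raw storage for at least n ints
    assert(v.capacity() >= n);
    if (n == 0) return true;

    v.push_back(0);
    const int* first = &v[0];      // address of element 0 before filling
    std::size_t cap = v.capacity();
    for (std::size_t i = 1; i < n; ++i)
        v.push_back(static_cast<int>(i));

    return first == &v[0] && cap == v.capacity();
}
```

Without the reserve call, the same pointer could be left dangling by any push_back that triggers growth.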
There is no one way it is implemented. Different implementations can be different, so long as they preserve the semantics and satisfy the requirements.
At any given time, there has to be a primitive array of T to satisfy the requirements of contiguity. However, how it is allocated, grown, shrunk, and freed is up to the implementor.
You can read the implementation for yourself; it's right there in the header file.
I can tell you that no implementations use linked lists. They aren't consistent with the requirements of the standard.