Is there a specific data structure that a deque in the C++ STL is supposed to implement, or is a deque just this vague notion of an array growable from both the front and the back?
It's implementation specific. All a deque requires is constant time insertion/deletion at the start/end, and at most linear elsewhere. Elements are not required to be contiguous.
Most implementations use what can be described as an unrolled list. Fixed-sized arrays get allocated on the heap and pointers to these arrays are stored in a dynamically sized array belonging to the deque.
A deque is typically implemented as a dynamic array of arrays of T.
 (a) (b) (c) (d)
 +-+ +-+ +-+ +-+
 | | | | | | | |
 +-+ +-+ +-+ +-+
  ^   ^   ^   ^
  |   |   |   |
+---+---+---+---+
| 1 | 8 | 8 | 3 |  (reference)
+---+---+---+---+
The arrays (a), (b), (c) and (d) are generally of fixed capacity, and the inner arrays (b) and (c) are necessarily full. (a) and (d) are not full, which gives O(1) insertion at both ends.
Imagine that we do a lot of push_front: (a) will fill up. When it is full and another insertion is performed, we first need to allocate a new fixed-size array, then grow the (reference) vector and push the pointer to the new array at its front.
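To make that concrete, here is a minimal, hedged sketch of such an "array of arrays" deque. The class name, the block size of 8, and the growth policy are invented for illustration; real standard-library implementations differ in many details, and push_back, destructors, and error handling are omitted.

#include <cstddef>
#include <new>
#include <vector>

template <typename T, std::size_t BlockSize = 8>
class naive_deque {                // illustrative only, not a real implementation
    std::vector<T*> blocks_;       // the (reference) vector of pointers to fixed-size arrays
    std::size_t front_slot_ = 0;   // slot of the first element inside blocks_.front()
    std::size_t size_ = 0;

public:
    void push_front(const T& value) {
        if (blocks_.empty() || front_slot_ == 0) {
            // The front block is full (or absent): allocate a fresh fixed-size block
            // and push its pointer at the front of the (reference) vector.
            blocks_.insert(blocks_.begin(),
                           static_cast<T*>(::operator new(sizeof(T) * BlockSize)));
            front_slot_ = BlockSize;
        }
        new (blocks_.front() + --front_slot_) T(value);   // exactly one T constructed
        ++size_;
    }

    T& operator[](std::size_t i) {
        // Random access: find the block, then the offset within it.
        std::size_t flat = front_slot_ + i;
        return blocks_[flat / BlockSize][flat % BlockSize];
    }

    std::size_t size() const { return size_; }
};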
This implementation trivially provides constant-time random access, O(1) insertion at both ends (as described above), and insertion/removal in the middle of the deque in O(min(distance(begin, it), distance(it, end))) operations (the Standard is slightly more stringent than what you required). However, it fails the requirement of amortized O(1) growth: because the arrays have fixed capacity, whenever the (reference) vector needs to grow we have O(N/capacity) pointer copies. Because pointers are trivially copied, a single memcpy call is possible, so in practice this is mostly constant... but it is insufficient to pass with flying colors.
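To put a rough number on that (the figures are made up for illustration): with N = 100,000 elements and fixed arrays of capacity 16, the (reference) vector holds 100,000 / 16 = 6,250 pointers, so growing it copies 6,250 pointers, roughly 50 KB on a 64-bit system, in one memcpy, while not a single T is touched.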
Still, push_front and push_back are more efficient than for a vector (unless you are using the MSVC implementation, which is notoriously slow because of the very small capacity of its arrays...).
Honestly, I know of no data structure, or data structure combination, that could satisfy both constant-time random access and truly constant-time (non-amortized) insertion/removal at both ends. I do know a few "near" matches:
deque
A deque<T> could be implemented correctly by using a vector<T*>. All the elements are copied onto the heap and the pointers stored in a vector. (More on the vector later.)
Why T* instead of T? Because the standard requires that "An insertion at either end of the deque invalidates all the iterators to the deque, but has no effect on the validity of references to elements of the deque."
(my emphasis). The T* helps to satisfy that. It also helps us to satisfy this: "Inserting a single element either at the beginning or end of a deque always ..... causes a single call to a constructor of T."
Now for the (controversial) bit. Why use a vector to store the T*? It gives us random access, which is a good start. Let's forget about the complexity of vector for a moment and build up to this carefully:
The standard talks about "the number of operations on the contained objects". For deque::push_front this is clearly 1, because exactly one T object is constructed and zero of the existing T objects are read or scanned in any way. This number, 1, is clearly a constant and is independent of the number of objects currently in the deque. This allows us to say that: 'For our deque::push_front, the number of operations on the contained objects (the Ts) is fixed and is independent of the number of objects already in the deque.'
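If you want to see this kind of counting in action on a real std::deque (a small experiment written for this answer, not part of the original question), an instrumented element type works well:

#include <deque>
#include <iostream>

// Counts operations performed on the contained objects (the Ts).
struct Counted {
    static inline int constructions = 0;
    static inline int copies_or_moves = 0;
    Counted() { ++constructions; }
    Counted(const Counted&) { ++copies_or_moves; }
    Counted(Counted&&) noexcept { ++copies_or_moves; }
};

int main() {
    std::deque<Counted> d;
    for (int i = 0; i < 10000; ++i)
        d.emplace_front();                 // one construction per call, nothing else

    std::cout << "constructions:   " << Counted::constructions << '\n'    // 10000
              << "copies or moves: " << Counted::copies_or_moves << '\n'; // 0
}

No matter how much pointer or block bookkeeping happens internally, the existing elements are never copied or moved.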
Of course, the number of operations on the T* will not be so well-behaved. When the vector<T*> grows too big, it will be reallocated and many T*s will be copied around. So yes, the number of operations on the T* will vary wildly, but the number of operations on T will not be affected.
Why do we care about this distinction between counting operations on T and counting operations on T*? It is because the standard says: "All of the complexity requirements in this clause are stated solely in terms of the number of operations on the contained objects."
For the deque, the contained objects are the T, not the T*, meaning we can ignore any operation which copies (or reallocates) a T*.
I haven't said much about how the vector would behave inside a deque. Perhaps we would interpret it as a circular buffer (with the vector always taking up its maximum capacity()), and then reallocate everything into a bigger buffer when the vector is full. The details don't matter.
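For concreteness only, the index arithmetic of such a circular buffer might look like the following (the function and parameter names are invented for this example):

#include <cstddef>

// Maps a logical element index to a physical slot in a circular buffer of
// the given capacity, where 'head' is the physical slot of logical element 0.
std::size_t slot_of(std::size_t head, std::size_t logical_index, std::size_t capacity) {
    return (head + logical_index) % capacity;
}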
In the last few paragraphs, we have analyzed deque::push_front and the relationship between the number of objects already in the deque and the number of operations performed by push_front on contained T objects. We found they are independent of each other. As the standard mandates that complexity is in terms of operations-on-T, we can say this has constant complexity.
Yes, the Operations-On-T*-Complexity is amortized (due to the vector), but we're only interested in the Operations-On-T-Complexity, and this is constant (non-amortized).
Epilogue: the complexity of vector::push_back or vector::push_front is irrelevant in this implementation; those considerations involve only operations on T* and hence do not count.
My understanding of deque
It allocates 'n' empty contiguous objects from the heap as the first sub-array. The objects in it are added exactly once, by the head pointer, on insertion.
When the head pointer reaches the end of a sub-array, it allocates/links a new, non-contiguous sub-array and adds objects there.
Objects are removed exactly once, by the tail pointer, on extraction. When the tail pointer finishes a sub-array of objects, it moves on to the next linked sub-array and deallocates the old one.
The intermediate objects between the head and tail are never moved in memory by deque.
A random access first determines which sub-array holds the object, then accesses it at its relative offset within that sub-array.
(Making this answer a community-wiki. Please get stuck in.)
First things first: A deque requires that any insertion to the front or back shall keep any reference to a member element valid. It's OK for iterators to be invalidated, but the members themselves must stay in the same place in memory. This is easy enough by just copying the members to somewhere on the heap and storing T* in the data structure under the hood. See this other StackOverflow question, "About deque<T>'s extra indirection".
(vector doesn't guarantee to preserve either iterators or references, whereas list preserves both).
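Here is a small, self-contained demonstration of that guarantee (the example is mine, not from the original question):

#include <deque>
#include <iostream>

int main() {
    std::deque<int> d = {1, 2, 3};
    int& ref = d[1];                  // reference to the element holding 2

    for (int i = 0; i < 1000; ++i) {  // plenty of growth at both ends
        d.push_front(i);
        d.push_back(i);
    }

    std::cout << ref << '\n';         // the reference is still valid: prints 2
}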
So let's just take this 'indirection' for granted and look at the rest of the problem. The interesting bit is the time to insert or remove from the beginning or end of the list. At first, it looks like a deque could trivially be implemented with a vector, perhaps by interpreting it as a circular buffer.
BUT A deque must satisfy "Inserting a single element either at the beginning or end of a deque always takes constant time and causes a single call to a constructor of T."
Thanks to the indirection we've already mentioned, it's easy to ensure there is just one constructor call, but the challenge is to guarantee constant time. It would be easy if we could just use constant amortized time, which would allow the simple vector implementation, but it must be constant (non-amortized) time.
This is an answer to user gravity's challenge to comment on the 2-array-solution.
Discussion of details: The user "gravity" has already given a very neat summary. "gravity" also challenged us to comment on the suggestion of balancing the number of elements between two arrays in order to achieve O(1) worst-case (instead of average-case) runtime. Well, the solution works efficiently if both arrays are ring buffers, and it appears to me that it is sufficient to split the deque into two segments, balanced as suggested. I also think that for practical purposes the standard STL implementation is at least good enough, but under real-time requirements and with properly tuned memory management one might consider using this balancing technique. There is also a different implementation given by Eric Demaine in an older Dr. Dobb's article, with a similar worst-case runtime.
Balancing the load of both buffers requires moving between 0 and 3 elements per operation, depending on the situation. For instance, if we keep the front segment in the primary array, a pushFront(x) must move the last 3 elements from the primary ring to the auxiliary ring in order to keep the required balance. A pushBack(x) at the rear must track the load difference and then decide when it is time to move one element from the primary to the auxiliary array.
Suggestion for improvement: There is less work and bookkeeping to do if front and rear are both stored in the auxiliary ring. This can be achieved by cutting the deque into three segments q1, q2, q3, arranged in the following manner: the front part q1 is in the auxiliary ring (the double-sized one) and may start at any offset from which the elements are arranged clockwise in subsequent order. The number of elements in q1 is exactly half of all elements stored in the auxiliary ring. The rear part q3 is also in the auxiliary ring, located exactly opposite to part q1 in the auxiliary ring, also clockwise in subsequent order. This invariant has to be kept between all deque operations. Only the middle part q2 is located (clockwise in subsequent order) in the primary ring.
Now, each operation will either move exactly one element, or allocate a new empty ring buffer when either one gets empty. For instance, a pushFront(x) stores x before q1 in the auxiliary ring. In order to keep the invariant, we move the last element from q2 to the front of the rear q3. So both q1 and q3 get an additional element at their fronts and thus stay opposite to each other. PopFront() works the other way round, and the rear operations work the same way. The primary ring (same as the middle part q2) goes empty exactly when q1 and q3 touch each other and form a full circle of subsequent elements within the auxiliary ring. Also, when the deque shrinks, q1 and q3 will go empty exactly when q2 forms a proper circle in the primary ring.