Why are Stack<T> and Queue<T> implemented with an array?

Submitted by 喜欢而已 on 2020-01-19 00:42:06

Question


I'm reading C# 4.0 in a Nutshell by the Albahari brothers and I came across this:

Stacks are implemented internally with an array that's resized as required, as with Queue and List. (pg 288, paragraph 4)

I can't help but wonder why. LinkedList provides O(1) head and tail inserts and deletes (which should work well for a stack or queue). A resizable array has O(1) amortized insert (if I remember right), but O(n) worst case (I'm not sure about delete). And it probably uses more space than the linked list (for large stacks/queues).

Is there more to it than that? What is the downside to a doubly linked list implementation?


Answer 1:


but O(n) worst case

The amortized worst case is still O(1). Long and short insertion times average out; that is the whole point of amortized analysis (and the same holds for deletion).
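The doubling argument can be checked with a quick simulation (an illustrative Python sketch, not anything .NET actually does; the initial capacity of 4 is an assumption): over n pushes into a doubling array, the total number of element copies stays below 2n, so the amortized cost per push is O(1).

```python
def copies_for_pushes(n, initial_capacity=4):
    """Count element copies a doubling array performs over n pushes."""
    capacity = initial_capacity
    size = 0
    copies = 0
    for _ in range(n):
        if size == capacity:
            copies += size      # growing copies every existing element once
            capacity *= 2
        size += 1
    return copies

# The total is a geometric series, 4 + 8 + 16 + ... , which stays below 2n.
for n in (1_000, 1_000_000):
    print(n, copies_for_pushes(n))
```

The occasional expensive push (the one that triggers a full copy) is paid for by the many cheap pushes before it, which is exactly the averaging the answer describes.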

An array also uses less space than a linked list (which after all has to store an additional pointer for each element).

Furthermore, the constant overhead is much lower than with a linked list. All in all, an array-based implementation is just (much) more efficient for almost all use cases, even though once in a while an access will take a little longer. (In fact, a queue can be implemented slightly more efficiently by using fixed-size pages that are themselves managed in a linked list; see C++'s std::deque implementation.)




Answer 2:


Here's a rough guesstimate of the memory used for a stack of 100 System.Int32 values:

An array implementation would require the following:

type designator                          4 bytes
object lock                              4
pointer to the array                     4 (or 8)
array type designator                    4
array lock                               4
int array                              400
stack head index                         4
                                       ---
Total                                  424 bytes  (in 2 managed heap objects)

A linked list implementation would require the following:

type designator                          4 bytes
object lock                              4
pointer to the last node                 4 (or 8)
node type designator         4 * 100 = 400
node lock                    4 * 100 = 400
int value                    4 * 100 = 400
pointer to next node  4 (or 8) * 100 = 400 (or 800)
                                     -----
Total                                1,612 bytes  (in 101 managed heap objects)

The main downside of the array implementation is the act of copying the array when it needs to be expanded. Ignoring all other factors, this is an O(n) operation, where n is the number of items in the stack. That seems pretty bad, except for two factors: it hardly ever happens, since the capacity is doubled at each expansion, and the array copy operation is highly optimized and amazingly fast. In practice, the expansion is easily swamped by the other stack operations.

Similarly for the queue.
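The queue case deserves one extra detail: naive dequeue from the front of an array would be O(n), so a head index is kept and advanced instead. Below is a minimal ring-buffer queue sketch (illustrative Python only, not the .NET source; .NET's Queue<T> uses the same head-index-over-a-resizable-array idea, but the exact layout here is assumed).

```python
class RingQueue:
    """Array-backed FIFO queue using a circular buffer."""

    def __init__(self, capacity=4):
        self._buf = [None] * capacity
        self._head = 0      # index of the next item to dequeue
        self._count = 0

    def enqueue(self, item):
        if self._count == len(self._buf):
            self._grow()
        tail = (self._head + self._count) % len(self._buf)
        self._buf[tail] = item
        self._count += 1

    def dequeue(self):
        if self._count == 0:
            raise IndexError("queue is empty")
        item = self._buf[self._head]
        self._buf[self._head] = None            # release the reference
        self._head = (self._head + 1) % len(self._buf)
        self._count -= 1
        return item

    def _grow(self):
        # Copy live elements into a doubled buffer, unwrapping the ring.
        new = [None] * (len(self._buf) * 2)
        for i in range(self._count):
            new[i] = self._buf[(self._head + i) % len(self._buf)]
        self._buf = new
        self._head = 0
```

Because dequeue only advances the head index, no elements are shifted; both operations are O(1) apart from the occasional grow, which is amortized away exactly as for the stack.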




Answer 3:


This is because .NET was designed to run on modern processors, which are much, much faster than the memory bus. The processor runs at around 2 gigahertz; the RAM in your machine is typically clocked at a couple of hundred megahertz. Reading a byte from RAM can take well over a hundred clock cycles.

That makes the CPU caches very important on modern processors; a large amount of chip real estate is spent on making the caches as big as possible. Typical today is 64 KB for the L1 cache, the fastest memory, physically located very close to the processor core; 256 KB for the L2 cache, slower and further from the core; and around 8 MB for the L3 cache, slower still and furthest away, shared by all the cores on the chip.

To make the caches effective, it is very important to access memory sequentially. Reading the first byte can be very expensive if an L3 or RAM access is necessary, but the next 63 bytes are very cheap: 64 bytes is the size of the "cache line", the unit of data transfer on the memory bus.

This makes an array by far the most effective data structure, since its elements are stored sequentially in memory, and a linked list by far the worst, since its elements are naturally scattered through memory, potentially incurring a very expensive cache miss for each element.

Accordingly, all .NET collections except LinkedList<> are implemented with arrays internally. Do note that a Stack<> maps naturally onto an array, since you can only push and pop an element at the end of the array, an O(1) operation. Over N pushes the array is resized only O(log N) times, so resizing costs amortized O(1) per push.
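That push-and-pop-at-the-end behaviour can be sketched in a few lines (an illustrative Python sketch, not the .NET implementation; the starting capacity of 4 is an assumption):

```python
class ArrayStack:
    """Stack backed by a growable array; push/pop only touch the end."""

    def __init__(self):
        self._items = [None] * 4
        self._size = 0

    def push(self, item):
        if self._size == len(self._items):
            # Grow by doubling, copying the existing elements once.
            new = [None] * (len(self._items) * 2)
            new[:self._size] = self._items
            self._items = new
        self._items[self._size] = item
        self._size += 1

    def pop(self):
        if self._size == 0:
            raise IndexError("stack is empty")
        self._size -= 1
        item = self._items[self._size]
        self._items[self._size] = None          # release the reference
        return item
```

Both operations touch only the slot at the current size, at the end of a contiguous block of memory, which is exactly the sequential access pattern the caches reward.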



Source: https://stackoverflow.com/questions/3000410/why-are-stackt-and-queuet-implemented-with-an-array
