Does anyone know a good resource to concisely explain the different types of lists available in C# and when their usage is appropriate?
For example, List, Hashtable,
In addition to the great answers so far, there are some more Collections available via the C5 Generic Collection Library. The documentation (also on their site) may help when deciding what to use depending on your requirments.
If you start at the MSDN doco for System.Collections, you can drill down into the individual collection types for more details about those "lists" and how to use them. For example, the doco for Hashtable
says, "Represents a collection of key/value pairs that are organized based on the hash code of the key."
There's also a nice discussion of System.Collections.Generic in Understanding Generics.
These aren't all lists, although they're all collections. Here's a quick summary.
Non-generic collections (API is in terms of object
. Values types are boxed.
These are mostly in the System.Collections namespace:
Generic collections. (Strongly-typed API, will not box value types (assuming suitable T).
These are mostly in the System.Collections.Generic namespace:
Possibly the most important collection interface is IEnumerable (and IEnumerable<T>). This represents a sequence of items much like a Stream represents a sequence of bytes. There is no random access, just forward-reading. LINQ to Objects is based on this, and pretty much all collection types implement it.
These are examples of various types of general data structures. These data structures are used all over the place in software engineering.
To expound on tobsen's earlier answer, the C5 Generic Collection Library has a large number of, well, collections. I'll describe some of them here:
Queue/Stack
CircularQueue<T>
: This class provides strictly Queue and Stack functionality. As well, efficient O(1) access to any item in the Stack/Queue is available using the indexer: cq[0]
(where 0 is the oldest item, next to be dequeued, last to be popped).Lists
Note: ArrayList
and LinkedList
can also function as Queue/Stacks
ArrayList<T>
: Similar to its counterpart in System.Collections.Generic (SCG)
, List<T>
, this is backed by an array, guaranteeing O(1) indexing, but worst-case O(n) insertion.O(n) to find an item.LinkedList<T>
: Similar to its counterpart SCG.LinkedList<T>
. Using a doubly-linked list, guarantees O(1) insertion, but worst-case O(n) indexing (in practice, is proportional to distance from either head or tail of the list). Also O(n) to find an item. Sorting uses a stable Merge Sort.HashedArrayList<T>
: Similar to the ArrayList<T>
above, but does not allow duplicates. The benefit you get in return is that the time to find an item and its index is reduced to O(1).HashedLinkedList<T>
: Similar to the LinkedList<T>
above, but does not allow duplicates. As before, the time to find an item is reduced to O(1), but time to find its index remains O(n).WrappedArray<T>
: Fairly similar to the ArrayList<T>
, this acts as a wrapper around an array that implements C5.IList<T>
, but throws exceptions if an attempt is made to modify the collection (IsFixedSize
is true and Add
, Remove
, Insert
don't work; Sort
, Shuffle
, and Reverse
do, however, as they are in-place operations).Lists also provide a "View" functionality which represents a segment of the underlying list, allowing local operations to be performed. Using patterns offered in the C5 book, operations can be performed using views that are efficient on both array and linked lists. Any list operation can also be performed on a view, restricting their effect to a subset of the underlying list.
Sorted Collections
SortedArray<T>
: Similar to an ArrayList<T>
except that it keeps its items sorted and does not allow duplicates. Note that random insertions and deletions on this collection are slow. This collection is best if the number of items is small or rarely modified but often accessed by item index or value.TreeSet<T>
: Uses a red-black tree structure to keep items sorted. As a set, it does not allow duplicates. Access by index or item value and insertion/deletion take O(log n).TreeBag<T>
: Uses a red-black tree, keeping items sorted. As a bag, it does allow duplicates, but does not store duplicates in the tree, rather keeping duplicates by counting.Both TreeSet<T>
and TreeBag<T>
provide the ability to efficiently make "snapshots" or persistent copies of the tree in O(1), allowing iteration over the snapshot while modifying the underlying tree. Note that each snapshot on a tree adds a performance penalty to updates to the tree, but these effects go away when the snapshot is disposed.
Hash Collections
HashSet<T>
: A collection using a simple hash table for storage. Access by item value takes O(1). As a set, it does not allow duplicates. Provides a function BucketCostDistribution()
that can help tell you determine the efficiency of the items' hashcode function.HashBag<T>
: Similar to the HashSet<T>
, but has bag semantics, meaning that duplicates are allowed, but duplicates are only stored by counting.Priority Queue
IntervalHeap<T>
: Provides a priority queue. Finding the maximum and minimum are O(1) operations, deleting the maximum, minimum, adding, and updating are O(log n) operations. Allows duplicates by storing them explicitly (not by counting).Dictionaries
HashDictionary<H,K>
: Similar to the SCG.Dictionary<H,K>
, provides entry access, insertion, and deletion in O(1). Also provides a BucketCostDistribution()
function as HashSet<T>
above. Does not guarantee any particular enumeration order.TreeDictionary<H,K>
: Similar to the SCG.SortedDictionary<H,K>
, provides a persistently sorted dictionary using a red-black tree. Entry access, insertion, and deletion take O(log n). Guarantees that enumeration of the dictionary follows the order specified by the key comparer.Guarded Collections
As well, C5 also offers "guarded" collections, which effectively acts as a read-only wrapper, preventing the collection from being modified. Items in the collection still may be modified, but items can't be added, deleted, or inserted into the collection.
A long answer, but thorough on the C5 libraries various collections at your disposal. I have found the C5 library to be great and often use it in my own code, replacing the common C# header with:
using C5;
using SCG = System.Collections.Generic;
Hash maps
This is a data structure that allows you to keep key-value pairs. Given a Key that has some way of being ordered, you can insert a Value. A simple example could be a list of students where the Key is the student ID, and the value the student name.
Random access lists
Random access lists are used to store a long list of objects that are to be accessed randomly (i.e. you want to access the n'th element in O(1) time). It is not good if you want to insert/delete elements in the middle of the list, since that would require the entire list to be shuffled around which could take some time.
Linked lists and similar
Linked lists are great if you don't want to access elements in the middle, since that would take O(N) time. It's great if you want to insert/remove elements in the middle since it only involves changing a few pointers.
Queues and Stacks are slightly specialised, in that they are optimised for FIFO and FILO behaviour (First-In-First-Out and First-In-First-Out respectively).