I'm trying to wrap my head around which data structures are the most efficient and when / where to use which ones.
Now, it could be that I simply just don't unders
If you are trying to get a structure as efficient as a Dictionary but you don't know for sure that the input contains no duplicate keys, Lookup is safer.
As mentioned in another answer, it also supports null keys and always returns a valid result when queried with arbitrary data, so it is more resilient to unknown input (less prone than Dictionary to throw exceptions).
This is especially true if you compare it to the System.Linq.Enumerable.ToDictionary function:
// won't throw
new[] { 1, 1 }.ToLookup(x => x);
// System.ArgumentException: An item with the same key has already been added.
new[] { 1, 1 }.ToDictionary(x => x);
The alternative would be to write your own duplicate key management code inside a foreach loop.
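A minimal sketch of what that hand-written duplicate handling could look like, written as C# top-level statements. The input array and the names groups and bucket are made up for illustration; this is one possible approach, not the only way to do it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical input containing a duplicate key.
int[] input = { 1, 1, 2 };

// Group the values by hand: one list per distinct key.
var groups = new Dictionary<int, List<int>>();
foreach (int item in input)
{
    int key = item; // same key selector as x => x in the snippet above

    // Create the bucket the first time a key is seen, instead of
    // letting Add throw on the second occurrence.
    if (!groups.TryGetValue(key, out var bucket))
    {
        bucket = new List<int>();
        groups.Add(key, bucket);
    }
    bucket.Add(item);
}

Console.WriteLine(groups[1].Count); // 2
Console.WriteLine(groups[2].Count); // 1
```

This ends up with the same shape of data that ToLookup produces in one call, except that the result stays mutable.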
If you don't need a list of items per key and you are going to manage a huge number of items, Dictionary (or even your own custom-tailored structure) would be more efficient:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

Stopwatch stopwatch = new Stopwatch();
var list = new List<string>();
for (int i = 0; i < 5000000; ++i)
{
    list.Add(i.ToString());
}
// Time only the construction of the lookup, not the list setup.
stopwatch.Start();
var lookup = list.ToLookup(x => x);
stopwatch.Stop();
Console.WriteLine("Creation: " + stopwatch.Elapsed);
// ... Same but for ToDictionary
var dictionary = list.ToDictionary(x => x);
// ...
As Lookup has to maintain a list of items for each key, it is slower to build than Dictionary (roughly 3.5x slower in this run with 5 million items):
Lookup speed: Creation: 00:00:01.5760444
Dictionary speed: Creation: 00:00:00.4418833
Two significant differences:
- Lookup is immutable. Yay :) (At least, I believe the concrete Lookup class is immutable, and the ILookup interface doesn't provide any mutating members. There could be other mutable implementations, of course.)
- When you look up a key that isn't present in a lookup, you get an empty sequence back instead of a KeyNotFoundException. (Hence there's no TryGetValue, AFAICR.)
They're likely to be equivalent in efficiency - the lookup may well use a Dictionary<TKey, GroupingImplementation<TValue>> behind the scenes, for example. Choose between them based on your requirements. Personally I find that the lookup is usually a better fit than a Dictionary<TKey, List<TValue>>, mostly due to the first two points above.
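To make the missing-key behaviour concrete, here is a small sketch written as C# top-level statements. The words array and the variable names are assumptions chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data, keyed by first letter.
string[] words = { "apple", "avocado", "banana" };

ILookup<char, string> lookup = words.ToLookup(w => w[0]);
Dictionary<char, List<string>> dictionary = words
    .GroupBy(w => w[0])
    .ToDictionary(g => g.Key, g => g.ToList());

// Lookup: indexing a key that was never added yields an empty sequence.
int missingCount = lookup['z'].Count();
Console.WriteLine(missingCount);         // 0
Console.WriteLine(lookup.Contains('z')); // False

// Dictionary: the indexer throws for a key that is not present.
bool threw = false;
try
{
    _ = dictionary['z'];
}
catch (KeyNotFoundException)
{
    threw = true;
}
Console.WriteLine(threw); // True
```

With the dictionary you would normally guard the access with TryGetValue or ContainsKey. With the lookup no guard is needed, and Contains is there if you need to tell "no such key" apart from "no items".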
Note that as an implementation detail, the concrete implementation of IGrouping<,> which is used for the values implements IList<TValue>, which means that it's efficient to use with Count(), ElementAt() etc.
Interesting that nobody has stated the actual biggest difference (Taken directly from MSDN):
A Lookup resembles a Dictionary. The difference is that a Dictionary maps keys to single values, whereas a Lookup maps keys to collections of values.
Another difference not mentioned yet is that Lookup supports null keys:
Lookup class implements the ILookup interface. Lookup is very similar to a dictionary except multiple values are allowed to map to the same key, and null keys are supported.
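A short sketch of the null-key difference, written as C# top-level statements. The sample arrays are made up for illustration; the second one deliberately has no duplicates, so that the only problem ToDictionary can hit is the null key.

```csharp
using System;
using System.Linq;

// Hypothetical input in which some keys are null.
var keys = new[] { null, "a", null };

// ToLookup accepts null keys and groups them like any other key.
var lookup = keys.ToLookup(k => k);
int nullCount = lookup[null].Count();
Console.WriteLine(nullCount); // 2

// ToDictionary rejects a null key with an ArgumentNullException.
bool threw = false;
try
{
    new[] { null, "a" }.ToDictionary(k => k);
}
catch (ArgumentNullException)
{
    threw = true;
}
Console.WriteLine(threw); // True
```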
Both a Dictionary<Key, List<Value>> and a Lookup<Key, Value> logically can hold data organized in a similar way and both are of the same order of efficiency. The main difference is a Lookup is immutable: it has no Add() methods and no public constructor (and as Jon mentioned you can query a non-existent key without an exception and have the key as part of the grouping).
As to which one to use, it really depends on how you want to use them. If you are maintaining a map of key to multiple values that is constantly being modified, then a Dictionary<Key, List<Value>> is probably better since it is mutable.
If, however, you have a sequence of data and just want a read-only view of the data organized by key, then a lookup is very easy to construct and will give you a read-only snapshot.
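The two usage patterns can be sketched side by side, as C# top-level statements. The orders list, the tuple element names and the variable names are hypothetical. The sketch relies on ToLookup executing immediately, so later changes to the source list are not reflected in the lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical source data: (customer, amount) pairs.
var orders = new List<(string Customer, int Amount)>
{
    ("alice", 10), ("bob", 20), ("alice", 30),
};

// Read-only snapshot: built in one call, never modified afterwards.
ILookup<string, int> byCustomer = orders.ToLookup(o => o.Customer, o => o.Amount);

// Changing the source afterwards does not affect the snapshot.
orders.Add(("alice", 40));
Console.WriteLine(byCustomer["alice"].Count()); // 2

// Mutable map: more code to build, but keys and items can change later.
var mutable = new Dictionary<string, List<int>>();
foreach (var (customer, amount) in orders)
{
    if (!mutable.TryGetValue(customer, out var amounts))
    {
        amounts = new List<int>();
        mutable[customer] = amounts;
    }
    amounts.Add(amount);
}
mutable["alice"].Add(50); // add an item to an existing key
mutable.Remove("bob");    // remove a key entirely

Console.WriteLine(mutable["alice"].Count);     // 4
Console.WriteLine(mutable.ContainsKey("bob")); // False
```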
The primary difference between an ILookup<K,V> and a Dictionary<K, List<V>> is that a dictionary is mutable; you can add or remove keys, and also add or remove items from the list that is looked up. An ILookup is immutable and cannot be modified once created.
The underlying implementation of both mechanisms will be either the same or similar, so their searching speed and memory footprint will be approximately the same.