Optimizing Lookups: Dictionary key lookups vs. Array index lookups

青春壹個敷衍的年華 submitted on 2019-11-28 22:41:15

Don't forget that Big-O notation only says how the complexity grows with respect to the size (etc) - it doesn't give any indication of the constant factors involved. That's why sometimes even a linear search for keys is faster than a dictionary lookup, when there are sufficiently few keys. In this case you're not even doing a search with the array though - just a straight indexing operation.

For straight index lookups, arrays are basically ideal - it's just a case of

pointer_into_array = base_pointer + offset * size

(And then a pointer dereference.)
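The address arithmetic above can be made visible with a small sketch. This is Python rather than .NET, and the helper name read_by_pointer is made up for illustration; it assumes an array of C ints and computes the element address by hand before dereferencing it.

```python
import array
import ctypes


def read_by_pointer(arr: array.array, index: int) -> int:
    """Read arr[index] by computing its address by hand (illustration only)."""
    if not 0 <= index < len(arr):
        raise IndexError(index)  # a raw pointer read would not check this for us
    base_pointer, _length = arr.buffer_info()        # address of element 0
    address = base_pointer + index * arr.itemsize    # base + offset * size
    return ctypes.c_int.from_address(address).value  # the pointer dereference


# Typecode "i" is a C signed int, which matches ctypes.c_int.
values = array.array("i", [10, 20, 30, 40, 50])
print(read_by_pointer(values, 3), values[3])  # prints "40 40"
```

One multiplication, one addition and one memory read: there is nothing to search and nothing to compare.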

Performing a dictionary lookup is relatively complicated - very fast compared with (say) a linear lookup by key when there are lots of keys, but much more complicated than a straight array lookup. It has to calculate the hash of the key, then work out which bucket that should be in, possibly deal with duplicate hashes (or duplicate buckets) and then check for equality.
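To make those steps concrete, here is a minimal sketch of a separate-chaining hash table in Python. The class TinyHashTable is hypothetical and is not how .NET's Dictionary is actually laid out; it only shows the hash, bucket, collision and equality steps described above.

```python
class TinyHashTable:
    """Hypothetical separate-chaining table, for illustration only."""

    def __init__(self, bucket_count: int = 8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket_for(self, key):
        hash_code = hash(key)                                  # 1. hash the key
        return self._buckets[hash_code % len(self._buckets)]  # 2. pick the bucket

    def put(self, key, value):
        bucket = self._bucket_for(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:        # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        bucket = self._bucket_for(key)
        for existing_key, value in bucket:  # 3. walk entries sharing the bucket
            if existing_key == key:         # 4. check for equality
                return value
        raise KeyError(key)


table = TinyHashTable(bucket_count=4)
table.put(1, "one")
table.put(5, "five")  # with 4 buckets, keys 1 and 5 land in the same bucket
print(table.get(5))   # prints "five"
```

Every get pays for a hash computation, a modulo, a loop and at least one comparison, all of which a plain array index skips.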

As always, choose the right data structure for the job - and if you really can get away with just indexing into an array (or List<T>) then yes, that will be blindingly fast.

Is this type of behavior expected (performance decrease by a factor of 8)?

Why not? Each array lookup is almost instantaneous/negligible, whereas a dictionary lookup may need at least an extra subroutine call.

The point of both being O(1) is that even if you have 50 times more items in each collection, the performance difference is still only the same constant factor, whatever it is (here, 8).

Something could take a millennium, and still be O(1).

If you single-step through this code in the disassembly window, you will quickly come to understand what the difference is.

Dictionary structures are most useful when the key space is very large and cannot be mapped into a stable, sequenced order. If you can convert your keys into a simple integer in a relatively small range, you will be hard-pressed to find a data structure that will perform better than an array.
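As a rough sketch of that conversion (in Python, with made-up bounds MIN_KEY and MAX_KEY standing in for whatever range your keys really occupy), subtracting the lowest key turns each key into a direct array index:

```python
MIN_KEY = 1000  # assumed lower bound of the key range (hypothetical)
MAX_KEY = 1099  # assumed upper bound of the key range (hypothetical)


def build_table(pairs):
    """Store each value at index (key - MIN_KEY); unused slots stay None."""
    table = [None] * (MAX_KEY - MIN_KEY + 1)
    for key, value in pairs:
        table[key - MIN_KEY] = value
    return table


def lookup(table, key):
    """Direct index lookup; no hashing and no bucket search."""
    if not MIN_KEY <= key <= MAX_KEY:
        raise KeyError(key)
    return table[key - MIN_KEY]


table = build_table([(1000, "first"), (1042, "answer"), (1099, "last")])
print(lookup(table, 1042))  # prints "answer"
```

The trade-off is memory: the table reserves a slot for every possible key in the range, so this only pays off when the range is small relative to the number of keys in use.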

On an implementation note: in .NET, dictionaries are essentially hash tables. You can somewhat improve their key-lookup performance by ensuring that your keys hash into a large space of unique values. It looks like in your case, you are using a simple integer as a key (which I believe hashes to its own value) - so that may be the best you can do.

An array lookup is about the fastest thing you can do - essentially all it is is a single bit of pointer arithmetic to go from the start of the array to the element you wanted to find. On the other hand, the dictionary lookup is likely to be somewhat slower since it needs to do hashing and concern itself with finding the correct bucket. Although the expected runtime is also O(1) - the algorithmic constants are greater so it will be slower.

Welcome to Big-O notation. You always have to consider that there is a constant factor involved.

Doing one Dict-Lookup is of course much more expensive than an array lookup.

Big-O only tells you how algorithms scale. Double the number of lookups and see how the numbers change: both should take around twice the time.
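If you want to try the doubling experiment yourself, something like the following Python sketch would do (the helper time_lookups and the sizes are arbitrary choices, not taken from the question). Absolute numbers depend on the machine and runtime, and in an interpreter the loop overhead is paid by both containers, so don't expect to reproduce the factor of 8 seen in .NET. What should hold is that each column roughly doubles.

```python
import timeit

SIZE = 1000
as_list = list(range(SIZE))
as_dict = {i: i for i in range(SIZE)}


def time_lookups(container, count: int) -> float:
    """Return the seconds taken for `count` lookups by integer key/index."""
    def run():
        for i in range(count):
            container[i % SIZE]  # the lookup being measured
    return timeit.timeit(run, number=1)


# Double the number of lookups and compare the two rows.
for count in (100_000, 200_000):
    print(count, time_lookups(as_list, count), time_lookups(as_dict, count))
```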

The cost of retrieving an element from a Dictionary is O(1), but that's because a dictionary is implemented as a hashtable - so you have to first calculate the hash value to know which element to return. Hashtables are often not that efficient - but they are good for large datasets, or datasets that have a lot of unique-hash values.

The List (apart from being a rubbish word used to describe an array rather than a linked list!) will be faster as it will return the value by directly calculating the element you want returned.
