Which is faster to find an item in a hashtable or in a sorted list?
Algorithm complexity is a good thing to know, and hashtables are known to be O(1) while a sorted vector (in your case I guess it is better to use a sorted array than a list) will provide O(log n) access time.
But you should know that complexity notation gives you the access time for N going to the infinite. That means that if you know that your data will keep growing, complexity notation gives you some hint on the algorithm to chose.
When you know that your data will keep a rather low length: for instance having only a few entries in your array/hashtable, you must go with your watch and measure. So have a test.
For instance, in another problem: sorting an array. For a few entries bubble sort while O(N^2) may be quicker than .. the quick sort, while it is O(n log n).
Also, accordingly to other answers, and depending on your item, you must try to find the best hash function for your hashtable instance. Otherwise it may lead to dramatic bad performance for lookup in your hashtable (as pointed out in Hank Gay's answer).
Edit: Have a look to this article to understand the meaning of Big O notation .
It depends entirely on the amount of data you have stored.
Assuming you have enough memory to throw at it (so the hash table is big enough), the hash table will locate the target data in a fixed amount of time, but the need to calculate the hash will add some (also fixed) overhead.
Searching a sorted list won't have that hashing overhead, but the time required to do the work of actually locating the target data will increase as the list grows.
So, in general, a sorted list will generally be faster for small data sets. (For extremely small data sets which are frequently changed and/or infrequently searched, an unsorted list may be even faster, since it avoids the overhead of doing the sort.) As the data set becomes large, the growth of the list's search time overshadows the fixed overhead of hashing, and the hash table becomes faster.
Where that breakpoint is will vary depending on your specific hash table and sorted-list-search implementations. Run tests and benchmark performance on a number of typically-sized data sets to see which will actually perform better in your particular case. (Or, if the code already runs "fast enough", don't. Just use whichever you're more comfortable with and don't worry about optimizing something which doesn't need to be optimized.)
In some cases, it depends on the size of the collection (and to a lesser degree, implementation details). If your list is very small, 5-10 items maybe, I'd guess the list would be faster. Otherwise xtofl has it right.
Unless the hashing algorithm is extremely slow (and/or bad), the hashtable will be faster.
UPDATE: As commenters have pointed out, you could also be getting degraded performance from too many collisions not because your hash algorithm is bad but simply because the hashtable isn't big enough. Most library implementations (at least in high-level languages) will automatically grow your hashtable behind the scenes—which will cause slower-than-expected performance on the insert that triggers the growth—but if you're rolling your own, it's definitely something to consider.
The get
operation in a SortedList
is O(log n)
while the same operation e a HashTable is O(1)
. So, normally, the HashTable
would be much faster. But this depends on a number of factors:
Assuming that by 'sorted list' you mean 'random-accessible, sorted collection'. A list has the property that you can only traverse it element by element, which will result in a O(N) complexity.
The fastest way to find an element in a sorted indexable collection is by N-ary search, O(logN), while a hashtable without collissions has a find complexity of O(1).