I have a set of uint32
integers, there may be millions of items in the set. 50-70% of them are consecutive, but in input stream they appear in unpredictable ord
You can use y-fast trees or van Emde Boas trees to achieve O(lg w) time queries, where w is the number of bits in a word, and you can use fusion trees to achieve O(lg_w n) time queries. The optimal tradeoff in terms of n is O(sqrt(lg(n))).
The easiest of these to implement is probably y-fast trees. They are probably faster than doing binary search, though they require roughly O(lg w) = O(lg 32) = O(5) hash table queries, while binary search requires roughly O(lg n) = O(lg 10000) = O(13) comparisons, so binary search may be faster.