A space efficient data structure to store and look-up through a large set of (uniformly distributed) Integers

悲哀的现实 2021-01-06 22:28

I'm required to hold in memory, and look up values in, one million uniformly distributed integers. My workload is extremely lookup-intensive.
My current implementation u

7 answers
  • 2021-01-06 23:11

    I think you might want to reconsider the original problem (having an efficient word list) rather than trying to optimize the "optimization".

    I would suggest looking into a radix tree or a trie.

    https://en.wikipedia.org/wiki/Radix_tree or https://en.wikipedia.org/wiki/Trie

    You are basically storing a tree of string prefixes that branches wherever the dictionary offers more than one continuation. It has some interesting side effects: it allows very efficient filtering by prefix, it can save memory when strings share long common prefixes, and it is reasonably fast.

    Some example implementations:

    https://lucene.apache.org/core/4_0_0/analyzers-stempel/org/egothor/stemmer/Trie.html

    https://github.com/rkapsi/patricia-trie

    https://github.com/npgall/concurrent-trees

    There is an interesting comparison of various implementations here. It focuses more on performance than on memory usage, but it can still be helpful:

    http://bhavin.directi.com/to-trie-or-not-to-trie-a-comparison-of-efficient-data-structures/
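
To make the prefix-tree idea above a bit more concrete, here is a minimal sketch of a plain (uncompressed) trie in Python. It is not taken from any of the libraries linked above; the names `TrieNode`, `Trie`, `insert`, `contains` and `with_prefix` are made up for this illustration. A radix tree would additionally merge chains of single-child nodes into one edge, which this sketch does not do.

```python
class TrieNode:
    """One node per character; children are keyed by the next character."""
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}
        self.is_word = False  # True if a stored word ends at this node


class Trie:
    """Hypothetical minimal trie: shared prefixes are stored only once."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            child = node.children.get(ch)
            if child is None:
                # Branch only where the dictionary offers a new continuation.
                child = TrieNode()
                node.children[ch] = child
            node = child
        node.is_word = True

    def _find(self, prefix):
        """Return the node reached by following prefix, or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._find(word)
        return node is not None and node.is_word

    def with_prefix(self, prefix):
        """Return all stored words that start with prefix, sorted."""
        start = self._find(prefix)
        if start is None:
            return []
        results = []
        stack = [(start, prefix)]
        while stack:
            node, text = stack.pop()
            if node.is_word:
                results.append(text)
            for ch, child in node.children.items():
                stack.append((child, text + ch))
        return sorted(results)
```

A lookup costs one dictionary access per character of the key, independent of how many words are stored, and `with_prefix` only visits the subtree below the prefix. Whether this actually saves memory depends on how much prefix sharing the data has and on the per-node overhead of the implementation.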
