HashSet of Strings taking up too much memory, suggestions…?

闹比i 2020-12-29 08:44

I am currently storing a list of around 120,000 words in a HashSet, to be used as a dictionary for checking whether entered words are spelled correctly, and the set is taking up a lot of memory.

8 Answers
  • 2020-12-29 09:43

    The problem is inherent in the design: storing such a huge number of words in a HashSet for spell checking isn't a good idea:

    You can either use a ready-made spell-checker (example: http://softcorporation.com/products/spellcheck/ ), or build an auto-completion structure on top of a prefix tree (description: http://en.wikipedia.org/wiki/Trie ); a minimal trie sketch follows below.

    As long as every word is kept as a separate String in a HashSet, there is no real way to reduce the memory usage.
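
    To illustrate the prefix-tree suggestion, here is a minimal sketch of a trie-backed dictionary (the class name TrieDictionary and the HashMap-per-node layout are illustrative, not from the original answer). Shared prefixes are stored once; in practice you would want a more compact node representation (array-backed children or a DAWG) to actually beat a HashSet of Strings on memory.

    ```java
    import java.util.HashMap;
    import java.util.Map;

    // Minimal sketch of a prefix tree (trie) used as a spell-check dictionary.
    public class TrieDictionary {
        private static final class Node {
            final Map<Character, Node> children = new HashMap<>();
            boolean isWord; // true if the path from the root to this node spells a word
        }

        private final Node root = new Node();

        public void add(String word) {
            Node node = root;
            for (int i = 0; i < word.length(); i++) {
                node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
            }
            node.isWord = true;
        }

        public boolean contains(String word) {
            Node node = root;
            for (int i = 0; i < word.length(); i++) {
                node = node.children.get(word.charAt(i));
                if (node == null) {
                    return false;
                }
            }
            return node.isWord;
        }

        public static void main(String[] args) {
            TrieDictionary dict = new TrieDictionary();
            dict.add("check");
            dict.add("checker");
            dict.add("checking");
            System.out.println(dict.contains("checker")); // true
            System.out.println(dict.contains("cheker"));  // false
        }
    }
    ```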

  • 2020-12-29 09:47

    Check out Bloom filters or cuckoo hashing; see the existing question "Bloom filter or cuckoo hashing?" for a comparison.

    I am not sure this is the answer to your question, but these alternatives are worth looking into; Bloom filters in particular are commonly used for spell-checker-style use cases. A sketch of the idea follows below.
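
    For completeness, here is a self-contained sketch of the Bloom-filter idea (the class name BloomDictionary and the hash-mixing function are illustrative, not a standard library API). It stores only bits, so 120,000 words need on the order of a megabit for roughly a 1% false-positive rate, at the cost of occasionally accepting a misspelled word.

    ```java
    import java.util.BitSet;

    // Minimal sketch of a Bloom filter for approximate dictionary membership.
    public class BloomDictionary {
        private final BitSet bits;
        private final int size;      // number of bits
        private final int numHashes; // number of hash functions

        public BloomDictionary(int sizeInBits, int numHashes) {
            this.bits = new BitSet(sizeInBits);
            this.size = sizeInBits;
            this.numHashes = numHashes;
        }

        // Simple integer mixing function (not a real MurmurHash).
        private static int mix(int h) {
            h ^= (h >>> 16);
            h *= 0x85ebca6b;
            h ^= (h >>> 13);
            return h;
        }

        // Derive the i-th bit index from two base hashes (double hashing).
        private int indexFor(String word, int i) {
            int h1 = word.hashCode();
            int h2 = mix(h1);
            return Math.floorMod(h1 + i * h2, size);
        }

        public void add(String word) {
            for (int i = 0; i < numHashes; i++) {
                bits.set(indexFor(word, i));
            }
        }

        // false -> definitely not in the dictionary
        // true  -> probably in the dictionary (small chance of a false positive)
        public boolean mightContain(String word) {
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(indexFor(word, i))) {
                    return false;
                }
            }
            return true;
        }

        public static void main(String[] args) {
            // ~10 bits per word with 7 hashes gives roughly a 1% false-positive rate
            // for 120,000 words.
            BloomDictionary dict = new BloomDictionary(1_200_000, 7);
            dict.add("hello");
            dict.add("world");
            System.out.println(dict.mightContain("hello"));  // true
            System.out.println(dict.mightContain("helllo")); // very likely false
        }
    }
    ```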
