Minimal perfect hash function

Submitted by 自古美人都是妖i on 2019-11-29 03:07:51

Question


I have many integers in the range [0; 2^63-1]. There are only 10^8 of them, however, and there are no duplicates. The full list is known at compile time, but it is just unique random numbers. These numbers never change.
Storing one integer explicitly requires 8 bytes, and each integer has an associated 1-byte value, so explicit storage requires about 900 MB (roughly 860 MiB).
So I want to find a minimal perfect hash function that maps each of the 10^8 integers from [0; 2^63-1] to [0; 10^8-1]. I need to find this function only once, the data never changes, and the function can be complicated. But it must be minimal and perfect, and evaluating it should be fast. What is the best way to do this? Maybe it is possible to find and exploit some subsequences if they happen to occur?
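As a quick check of the storage estimate, the arithmetic can be written out. The constant names below are illustrative only; the figures (10^8 keys, 8-byte keys, 1-byte values) come from the question.

```python
# Storage estimate for keeping every key explicitly next to its value.
N_KEYS = 10**8       # number of distinct integers
KEY_BYTES = 8        # one 63-bit integer fits in 8 bytes
VALUE_BYTES = 1      # associated value per key

explicit_bytes = N_KEYS * (KEY_BYTES + VALUE_BYTES)
values_only_bytes = N_KEYS * VALUE_BYTES  # what remains if the keys need not be stored

print(explicit_bytes)                    # 900000000 bytes
print(round(explicit_bytes / 2**20, 1))  # 858.3 (MiB)
print(values_only_bytes)                 # 100000000 bytes
```

The last figure is the size of the value array alone, which is the part that a minimal perfect hash function would index directly.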
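To make the two requirements concrete: "perfect" means no two keys map to the same index, and "minimal" means the indices are exactly 0..n-1 for n keys. The helper below is a hypothetical checker (its name and the toy key set are assumptions for illustration, not part of any library) that verifies both properties for a candidate function.

```python
def is_minimal_perfect(hash_fn, keys):
    """Return True if hash_fn maps the keys one-to-one onto 0..len(keys)-1."""
    n = len(keys)
    seen = bytearray(n)  # one flag per output slot
    for k in keys:
        i = hash_fn(k)
        if not 0 <= i < n:   # out of range: not minimal
            return False
        if seen[i]:          # slot already taken: not perfect
            return False
        seen[i] = 1
    return True

# Toy key set: k % 4 happens to be a bijection onto 0..3 here, k % 3 is not.
toy_keys = [10, 21, 32, 43]
print(is_minimal_perfect(lambda k: k % 4, toy_keys))  # True
print(is_minimal_perfect(lambda k: k % 3, toy_keys))  # False
```

Because n distinct in-range outputs for n keys necessarily cover every slot, checking range and collisions is enough.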
Thanks.


Answer 1:


Let your computer do the work for you:

http://www.gnu.org/software/gperf/

Quote: "GNU gperf is a perfect hash function generator. For a given list of strings, it produces a hash function and hash table, in form of C or C++ code, for looking up a value depending on the input string. The hash function is perfect, which means that the hash table has no collisions, and the hash table lookup needs a single string comparison only. "




Answer 2:


I'm working on an algorithm and Java implementation that needs less than 1.6 bits per key.

Previously, I have implemented a minimal perfect hash function tool in Java that needs less than 2.0 bits per key.

Other algorithms are implemented in CMPH. For example, CHD needs about 2.06 bits per key by default. It can be configured to use less space, but generation is then slower.
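To illustrate the general idea behind hash-and-displace schemes such as CHD, here is a toy sketch in Python. It is not CMPH's implementation and does not reach the compact sizes quoted above: the mixer, the bucket size, and the function names `mix`, `build`, and `lookup` are all assumptions made for illustration. Keys are first split into small buckets, and then each bucket gets a displacement (a seed) chosen so that all of its keys land in free, distinct slots.

```python
import random

MASK64 = (1 << 64) - 1

def mix(key, seed):
    """SplitMix64-style mixer of (key, seed); an illustrative hash choice."""
    x = (key + seed * 0x9E3779B97F4A7C15) & MASK64
    x ^= x >> 30
    x = (x * 0xBF58476D1CE4E5B9) & MASK64
    x ^= x >> 27
    x = (x * 0x94D049BB133111EB) & MASK64
    x ^= x >> 31
    return x

def build(keys, avg_bucket_size=4):
    """Return one displacement per bucket so that lookup() hits 0..n-1 exactly once."""
    n = len(keys)
    r = max(1, n // avg_bucket_size)       # number of buckets
    buckets = [[] for _ in range(r)]
    for k in keys:
        buckets[mix(k, 0) % r].append(k)   # first-level hash picks the bucket
    used = bytearray(n)                    # which output slots are taken
    disp = [0] * r
    # Place the largest buckets first, while most slots are still free.
    for b in sorted(range(r), key=lambda i: len(buckets[i]), reverse=True):
        if not buckets[b]:
            continue
        d = 1
        while True:
            slots = [mix(k, d) % n for k in buckets[b]]
            if len(set(slots)) == len(slots) and not any(used[s] for s in slots):
                break                      # every key of this bucket fits
            d += 1                         # otherwise try the next displacement
        for s in slots:
            used[s] = 1
        disp[b] = d
    return disp

def lookup(key, disp, n):
    """Evaluate the hash: find the key's bucket, then apply its displacement."""
    return mix(key, disp[mix(key, 0) % len(disp)]) % n

rng = random.Random(1)
keys = list({rng.getrandbits(63) for _ in range(1000)})  # distinct 63-bit keys
disp = build(keys)
n = len(keys)
print(sorted(lookup(k, disp, n) for k in keys) == list(range(n)))  # True
```

Evaluation costs two hash computations and one array read. Note that a key outside the original set still maps to some index in range, so membership cannot be detected from the hash alone. For scale, at the 2.06 bits per key quoted for CHD, 10^8 keys would need about 2.06 × 10^8 bits, or roughly 25.75 MB, for the function itself.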



Source: https://stackoverflow.com/questions/6743316/minimal-perfect-hash-function
