Very low cost hash function

后端 未结 5 1975
死守一世寂寞
死守一世寂寞 2020-12-25 08:33

I need a hash function for a Look Up table, so that if my values are from 0 to N, I need a hash function that give me a value from 0 to n, being n << N. Another piece

相关标签:
5条回答
  • 2020-12-25 09:10

    CRC?

    There is already a lot of hardware support for this too.

    0 讨论(0)
  • 2020-12-25 09:25

    The canonical form of that is h(x) = (a*x + b) mod n, where a and b are constants and n is the size of your hash table. You want to make n a prime number, to get optimal(ish) distribution.

    Note that this is sensitive to certain kind of distributions -- for example, just doing x mod n is mostly relying on randomness of low-order bits; if they are not random in your set, you will get fairly significant skew.

    Bob Jenkins has designed several very good hashing functions; here's one specifically designed to be simple to implement in hardware: http://burtleburtle.net/bob/hash/nandhash.html

    For a lot of different hash functions, design discussions, etc, see the rest of the site: http://burtleburtle.net/bob/hash/

    0 讨论(0)
  • 2020-12-25 09:26

    I believe this is the best possible hash for this problem (faster than modulo, better distribution), given that all your numbers in 0..N have the same probability:

    h = z * n / N;
    

    Where all values are integers, so you have an integer division. This way each value between 0..N is mapped to exactly the same number of values in n.

    For example, when n=3 and N=7 (values 3 and 7 not included in the ranges), the hashes are this:

    z * n / N = hash
    ----------------
    0 * 3 / 7 = 0
    1 * 3 / 7 = 0
    2 * 3 / 7 = 0
    3 * 3 / 7 = 1
    4 * 3 / 7 = 1
    5 * 3 / 7 = 2
    6 * 3 / 7 = 2
    

    So each hash values are used equally often, just off by 1. Just take care that n*(N-1) does not overflow.

    If N is a power of 2, you can replace the division by shifting. e.g. if N=256:

    h = (z * n) >> 8;
    
    0 讨论(0)
  • 2020-12-25 09:32

    If you're truly talking hardware (vs. software, or hardware implementation of software), and your number of hash buckets n can be written as n = 2m - 1, the easiest is probably a maximum-length linear feedback shift register (LFSR) of which CRC is an instance.

    Here's one way you could use an m-bit shift register to create a hash of a data packet (make sure all data is represented consistently as a K-bit string, if you have shorter strings then pad one end with zeros):

    1. Initialize the state of the LFSR (CRC-32 uses all 1's; all zeros is probably bad)
    2. Shift in the bits of your data
    3. (Optional) Shift in an additional j zeros (j between m and 2m is probably a good choice); this adds some additional hashing to reduce direct correlation between input/output bits
    4. Use the contents of the m-bit shift register as your hashed value.
    0 讨论(0)
  • 2020-12-25 09:34

    Rewire bits in random order and take lower log2(n) bits

    Or just take lower log2(n) bits if your data is evenly distributed.

    0 讨论(0)
提交回复
热议问题