Using hash functions with Bloom filters

情话喂你 2021-01-03 14:58

A Bloom filter uses a hash function (or many) to generate a value between 0 and m given an input string X. My question is: how do you use a hash function to generate a value

2 Answers
  • 2021-01-03 15:17

    You should first convert the hash output to an unsigned integer, then reduce it modulo m. In Java, it looks like this:

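    import java.math.BigInteger;
    import java.security.MessageDigest;

    MessageDigest md = MessageDigest.getInstance("MD5");
    // hash data here, e.g. with md.update(...) on the input bytes
    byte[] hashValue = md.digest();
    // signum 1: interpret the digest bytes as a non-negative integer
    BigInteger n = new BigInteger(1, hashValue);
    n = n.mod(m);
    // at this point, n has a value between 0 and m-1 (inclusive)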
    

    I have assumed that m is a BigInteger instance. If m is a primitive, convert it with BigInteger.valueOf(). Similarly, use n.intValue() or n.longValue() to get the value of n as one of Java's primitive types.
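Putting the pieces together, a minimal self-contained sketch might look like the following. The class name BloomIndex, the method indexFor, and the choice of UTF-8 for turning the string into bytes are assumptions made for illustration; they are not part of the answer above.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class BloomIndex {
    // Hypothetical helper: maps a string to a bit position in [0, m-1].
    static int indexFor(String input, int m) {
        if (m <= 0) {
            throw new IllegalArgumentException("m must be positive");
        }
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // Assumption: the input string is encoded as UTF-8 before hashing.
            byte[] hashValue = md.digest(input.getBytes(StandardCharsets.UTF_8));
            // Signum 1 makes BigInteger treat the digest as an unsigned value.
            BigInteger n = new BigInteger(1, hashValue);
            // The result is below m, so it always fits in an int.
            return n.mod(BigInteger.valueOf(m)).intValue();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        int m = 1024; // number of bits in the filter (example value)
        System.out.println(indexFor("hello", m));
    }
}
```

Because the remainder is always smaller than m, calling intValue() here cannot truncate as long as m itself is an int.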

    The modular reduction is somewhat biased, but the bias is very small if m is substantially smaller than 2^128, the number of possible MD5 outputs.
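To put a rough number on that bias, assume (as an idealization) that the digest is uniformly distributed over all 2^128 values, and write the division of 2^128 by m with quotient q and remainder r:

```latex
2^{128} = q\,m + r, \qquad 0 \le r < m

\Pr[n = j] =
\begin{cases}
  \dfrac{q+1}{2^{128}} & \text{if } 0 \le j < r, \\[2ex]
  \dfrac{q}{2^{128}}   & \text{if } r \le j < m,
\end{cases}
\qquad
\frac{(q+1) - q}{q} = \frac{1}{q} \approx \frac{m}{2^{128}}
```

The most likely indices are therefore more probable than the least likely ones by a relative factor of about m / 2^128. For m = 10^9 that is roughly 3 × 10^-30. If m is a power of two no larger than 2^128, then r = 0 and, under the same assumption, there is no bias at all.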

  • 2021-01-03 15:18

    Simplest way would probably be to just convert the hash output (as a byte sequence) to a single binary number and take that modulo m.
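When m fits in a long, a lighter variation on this idea avoids BigInteger by using only the first 64 bits of the digest as the binary number. This is a sketch of that variation, not the exact method described above: it discards the remaining digest bytes, and the names BloomIndex64 and indexFor, as well as the UTF-8 encoding, are hypothetical choices.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class BloomIndex64 {
    // Hypothetical helper: reduces the first 8 digest bytes modulo m.
    static long indexFor(String input, long m) {
        if (m <= 0) {
            throw new IllegalArgumentException("m must be positive");
        }
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            // Read the first 8 bytes as a big-endian long; as a signed
            // value it may be negative, so the remainder is taken unsigned.
            long first64 = ByteBuffer.wrap(digest).getLong();
            return Long.remainderUnsigned(first64, m);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(indexFor("hello", 1024L));
    }
}
```

Using a plain % on the signed long could yield a negative index, which is why the sketch relies on Long.remainderUnsigned.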
