Using hash functions with Bloom filters

情话喂你 2021-01-03 14:58

A Bloom filter uses one or more hash functions to map an input string X to a value between 0 and m. My question is: how do you use a hash function to generate a value in that range?

2 answers
  •  悲哀的现实
    2021-01-03 15:17

    You should first convert the hash output to an unsigned integer, then reduce it modulo m. It looks like this:

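    import java.math.BigInteger;
    import java.security.MessageDigest;

    MessageDigest md = MessageDigest.getInstance("MD5");
    // feed the input bytes to the digest with md.update(...)
    byte[] hashValue = md.digest();
    // signum 1 makes BigInteger read the bytes as an unsigned, big-endian value
    BigInteger n = new BigInteger(1, hashValue);
    n = n.mod(m);
    // at this point, n has a value between 0 and m-1 (inclusive)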
    

    I have assumed that m is a BigInteger instance. If m is an int or a long, convert it with BigInteger.valueOf(). Similarly, use n.intValue() or n.longValue() to get the value of n as one of Java's primitive types.
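    Put together, the steps above might look like the following self-contained sketch. The class name BloomIndexSketch, the method indexFor, the UTF-8 encoding of the input, and the choice of m = 1024 are all assumptions made for illustration; none of them come from the question.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;

// Hypothetical helper class, named here only for illustration.
class BloomIndexSketch {

    // Maps the string x to a bit position in [0, m) using MD5 and modular reduction.
    // Assumes the input is encoded as UTF-8 before hashing.
    static int indexFor(String x, int m) {
        if (m <= 0) {
            throw new IllegalArgumentException("m must be positive");
        }
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] hashValue = md.digest(x.getBytes(StandardCharsets.UTF_8));
            // signum 1: interpret the 16 digest bytes as an unsigned integer
            BigInteger n = new BigInteger(1, hashValue);
            // the result is smaller than m, so it always fits in an int
            return n.mod(BigInteger.valueOf(m)).intValue();
        } catch (NoSuchAlgorithmException e) {
            // every Java platform is required to provide MD5
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int m = 1024;                    // number of bits in the filter (assumed)
        BitSet bits = new BitSet(m);
        bits.set(indexFor("hello", m));  // record "hello" in the filter
        // the same input always maps to the same position
        System.out.println(bits.get(indexFor("hello", m))); // prints "true"
    }
}
```

    This sketch sets a single bit per element. A filter that uses several hash functions would need one such index per function.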

    The modular reduction is somewhat biased, but since MD5 outputs are 128 bits long, the bias is very small if m is substantially smaller than 2^128.
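    To put a rough number on that bias, here is a short derivation. It assumes an idealized hash whose output H is uniformly distributed over the 128-bit values, which is a modelling assumption rather than a proven property of MD5. The symbols q and r are introduced here only for the calculation.

```latex
Write $2^{128} = qm + r$ with $0 \le r < m$. If $H$ is uniform on
$\{0, 1, \dots, 2^{128} - 1\}$, then for each $i \in \{0, \dots, m-1\}$
\[
  \Pr[H \bmod m = i] =
  \begin{cases}
    \dfrac{q+1}{2^{128}} & \text{if } i < r, \\[2ex]
    \dfrac{q}{2^{128}}   & \text{if } i \ge r.
  \end{cases}
\]
Compared with the ideal value $1/m$:
\[
  \left| \frac{q+1}{2^{128}} - \frac{1}{m} \right| = \frac{m - r}{m \cdot 2^{128}},
  \qquad
  \left| \frac{q}{2^{128}} - \frac{1}{m} \right| = \frac{r}{m \cdot 2^{128}},
\]
so every probability is within $2^{-128}$ of $1/m$, and the relative
error is at most $m / 2^{128}$.
```

    For example, with m = 2^32 the relative error under this model is at most 2^-96.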
