I'm looking for a function that maps a multi-set of integers to an integer, hopefully with some kind of guarantee like pairwise independence.
Ideally, memory usage woul
For example, 00001011 becomes 11010000. Then just sum all the bit-reversed elements.
If we need O(1) insert/delete, a plain sum of the elements will work (that is how Java's AbstractSet.hashCode combines element hash codes), though the result is not well distributed over sets of small integers.
If the elements are not uniformly distributed (and they usually are not), we need a mapping N -> f(N) such that f(N) is close to uniformly distributed for the expected data sample. Usually a data sample contains many more close-to-zero numbers than close-to-maximum numbers. In that case, a reverse-bits hash moves the variation in the low bits into the high bits, which spreads such values across the whole integer range.
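As a rough sketch of the O(1) insert/delete idea: keep a running sum and adjust it on each change. The class name SumMultisetHash and the elemHash parameter are made up for illustration; any per-element hash can be plugged in.

```scala
// Hypothetical helper: maintains the hash of a multi-set incrementally.
// elemHash is whatever per-element mapping you choose (identity gives the plain sum).
final class SumMultisetHash(elemHash: Int => Int) {
  private var h = 0

  // Int addition wraps modulo 2^32, so the order of operations does not matter.
  def add(e: Int): Unit = h += elemHash(e)

  // Subtraction is the exact inverse of add, even after overflow.
  def remove(e: Int): Unit = h -= elemHash(e)

  def value: Int = h
}
```

Because addition is commutative and invertible, the value depends only on the current contents of the multi-set, not on the order in which elements were added or removed.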
Example in Scala:
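// Reverse the 32 bits of v: bit 0 becomes bit 31, bit 1 becomes bit 30, and so on.
def hash(v: Int): Int = {
  var h = v & 1
  for (i <- 1 to 31) {
    h <<= 1                  // make room for the next bit
    h |= ((v >>> i) & 1)     // append bit i of v
  }
  h
}

// Sum of the bit-reversed elements; Int overflow simply wraps around.
def hash(a: Set[Int]): Int = {
  var h = 0
  for (e <- a) {
    h += hash(e)
  }
  h
}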
The resulting hash of the multi-set will still not be uniform, but it should be distributed much better than the simple sum.
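Note that a Set[Int] drops duplicates, so for an actual multi-set the multiplicities have to be counted somewhere. A minimal sketch, assuming the multi-set is given as a Map from element to count (the names reverseBits and multisetHash are mine, not from any library); it uses the JDK's Integer.reverse, which performs the same 32-bit reversal as the loop above:

```scala
// Bit reversal via the JDK built-in java.lang.Integer.reverse.
def reverseBits(v: Int): Int = Integer.reverse(v)

// Assumed representation: element -> multiplicity.
// Each element contributes its reversed value once per occurrence;
// the arithmetic wraps modulo 2^32 like the plain sum.
def multisetHash(counts: Map[Int, Int]): Int =
  counts.foldLeft(0) { case (h, (e, n)) => h + n * reverseBits(e) }
```

One caveat with any additive scheme: repeated elements can cancel out under wrap-around. For example, two copies of 1 contribute 2 * 0x80000000, which wraps to 0, so this multi-set hashes the same as the empty one.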