What is a good hash function for a collection (i.e., multi-set) of integers?

不知归路 2021-02-05 06:03

I'm looking for a function that maps a multi-set of integers to an integer, ideally with some kind of guarantee such as pairwise independence.

Ideally, memory usage woul

6 answers

攒了一身酷 (original poster)

2021-02-05 06:50
Reverse the bits.

For example, 00001011 becomes 11010000. Then just sum all the reversed elements.
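The 8-bit example can be checked with a small sketch. It assumes the JDK's built-in Integer.reverse (which reverses all 32 bits) rather than a hand-written loop; the object and helper names `ReverseByteDemo` and `reverseByte` are made up for illustration.

```scala
object ReverseByteDemo {
  // Hypothetical helper: reverse the low 8 bits of v.
  // Integer.reverse flips all 32 bits, so the reversed low byte lands in the
  // top byte; the unsigned shift brings it back down.
  def reverseByte(v: Int): Int = Integer.reverse(v & 0xFF) >>> 24

  def main(args: Array[String]): Unit = {
    val in = Integer.parseInt("00001011", 2)
    println(Integer.toBinaryString(reverseByte(in))) // prints 11010000
  }
}
```

For the actual hash the full 32-bit reversal is used, so small inputs end up in the high bits of the result.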

If we need O(1) insert and delete, the plain sum of the elements will work (that is how Set.hashCode() is defined in Java: the sum of the elements' hash codes), though it is not well distributed over sets of small integers.
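A minimal sketch of the O(1) update idea, assuming the hash is a wrapping Int sum of some per-element function `f`. The class name `MultisetHash` and its methods are hypothetical, not taken from any library.

```scala
// Hypothetical class: keeps a running hash of a multi-set of Ints.
// f is the per-element mapping; the default is the plain sum.
final class MultisetHash(f: Int => Int = (x: Int) => x) {
  private var sum = 0

  // O(1): Int addition wraps on overflow, which is fine for a hash.
  def add(x: Int): Unit = sum += f(x)

  // O(1): undoes a previous add(x). The caller must only remove
  // elements that are actually in the multi-set.
  def remove(x: Int): Unit = sum -= f(x)

  def value: Int = sum
}
```

Because addition is commutative, the result does not depend on insertion order, and duplicates are counted once per occurrence. Passing `x => Integer.reverse(x)` as `f` gives the reversed-bits variant.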

If the elements are not uniformly distributed (which is the usual case), we need a mapping N -> f(N) such that f(N) is uniformly distributed for the expected data sample. Typically a data sample contains many more close-to-zero numbers than close-to-maximum numbers. In that case, the reverse-bits mapping spreads them across the whole integer range.

Example in Scala:
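```
// Reverse the 32 bits of v (same result as java.lang.Integer.reverse(v)).
def reverseBits(v: Int): Int = {
  var h = v & 1
  for (i <- 1 to 31) {
    h <<= 1
    h |= ((v >>> i) & 1)
  }
  h
}

// Hash of a multi-set: sum of the bit-reversed elements (wraps on overflow).
// Takes an Iterable rather than a Set so that duplicates are counted.
def hash(a: Iterable[Int]): Int = {
  var h = 0
  for (e <- a) {
    h += reverseBits(e)
  }
  h
}
```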
The hash of the multi-set will still not be uniform, but it is much better distributed than the plain sum.
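To make that caveat concrete, here is a small sketch of one collision that survives the bit reversal. It assumes the JDK's Integer.reverse in place of the hand-written loop, and the object name `CollisionDemo` is hypothetical. When the elements share no set bits, adding them produces no carries, so the sum of the reversals equals the reversal of the sum.

```scala
object CollisionDemo {
  // Sum of bit-reversed elements; Int arithmetic wraps on overflow.
  def hash(xs: Iterable[Int]): Int =
    xs.iterator.map(x => Integer.reverse(x)).sum

  def main(args: Array[String]): Unit = {
    // reverse(1) = 0x80000000, reverse(2) = 0x40000000, reverse(3) = 0xC0000000
    println(hash(Seq(1, 2)) == hash(Seq(3)))       // prints true
    // Duplicates do change the result.
    println(hash(Seq(1, 2, 2)) == hash(Seq(1, 2))) // prints false
  }
}
```

So {1, 2} and {3} collide under both the plain sum and the reversed-bits sum; the reversal changes where the values land, not which multi-sets of this kind collide.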