I'm looking for a function that maps a multiset of integers to an integer, hopefully with some kind of guarantee like pairwise independence.
Ideally, memory usage woul
Min-hashing should work here: apply a permutation to the values, maintain a small multiset of the n smallest permuted elements, and pick the biggest of those.
Elaborating: this is a simple way to work in O(1) time per element and O(1) space, for a fixed n. You need something like a priority queue, but without making the link to the initial values too obvious. So you order the priority queue by some elaborate key, which is equivalent to running a priority queue on a permutation of the normal sort order. Have the queue keep track of multiplicity, so that the selected elements also form a multiset.
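To make that concrete, here is a minimal sketch in Python. Everything named here is my own assumption rather than part of the scheme itself: `mix64` (the SplitMix64 finalizer, used as a stand-in for "some permutation" of the 64-bit integers), `min_hash`, the `seed` parameter, and the default `n = 4`. Duplicates are simply pushed as separate heap entries, which is one way of keeping track of multiplicity.

```python
import heapq

MASK = (1 << 64) - 1  # work modulo 2**64

def mix64(x, seed=0):
    """Bijective mixer on 64-bit integers (SplitMix64 finalizer).

    Stand-in for the 'elaborate key'; any keyed permutation would do.
    """
    z = (x + seed + 0x9E3779B97F4A7C15) & MASK
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK
    return z ^ (z >> 31)

def min_hash(items, n=4, seed=0):
    """Return the largest of the n smallest permuted values.

    Duplicates count separately. Returns None for an empty input, and
    the largest permuted value if there are fewer than n items.
    """
    heap = []  # max-heap via negated keys, never holds more than n keys
    for x in items:
        k = mix64(x, seed)
        if len(heap) < n:
            heapq.heappush(heap, -k)
        elif k < -heap[0]:
            # k beats the current largest kept key, so swap it in
            heapq.heapreplace(heap, -k)
    return -heap[0] if heap else None

# The result does not depend on the order of the input.
print(min_hash([3, 1, 2, 2, 9, 7], n=3) == min_hash([7, 9, 2, 2, 1, 3], n=3))
```

Each update costs O(log n) heap work, which is constant for a fixed n, and the heap never grows beyond n entries.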
That said, I'm not sure this disperses well enough (and running multiple permutations might become costly), so maybe build on Bradley's answer instead. Here is a tweak so that repeated elements don't cancel out:
xor(int_hash(x_n, multiplicity_n) foreach n)
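A rough Python sketch of that formula, under my own assumptions: `int_hash` is built here from two rounds of the SplitMix64 finalizer (`mix64`), and the names `multiset_hash` and `seed` are made up for illustration. Any decent hash of the pair (value, multiplicity) should work in its place.

```python
from collections import Counter

MASK = (1 << 64) - 1  # work modulo 2**64

def mix64(x, seed=0):
    """Bijective mixer on 64-bit integers (SplitMix64 finalizer)."""
    z = (x + seed + 0x9E3779B97F4A7C15) & MASK
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK
    return z ^ (z >> 31)

def int_hash(x, multiplicity, seed=0):
    """Hash the pair (x, multiplicity) to a 64-bit integer.

    Hypothetical construction: mix x, fold in the multiplicity, mix again.
    """
    return mix64(mix64(x, seed) ^ (multiplicity & MASK), seed)

def multiset_hash(items, seed=0):
    """XOR of int_hash(x, multiplicity) over the distinct elements x."""
    h = 0
    for x, m in Counter(items).items():
        h ^= int_hash(x, m, seed)
    return h

# Order does not matter, and a repeated element no longer cancels itself.
print(multiset_hash([1, 2, 1]) == multiset_hash([2, 1, 1]))
print(multiset_hash([1, 1]) != multiset_hash([1]))
```

Note that this version needs the multiplicities up front, so the `Counter` takes memory proportional to the number of distinct elements, unlike the bounded queue in the min-hash approach.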