Hash function on a list, independent of the order of its items


无人共我 2021-01-19 21:54

I want to have a dictionary that assigns a value to a set of integers.

For example key is [1 2 3] and value will have certain

9 answers

执笔经年 (original poster)

2021-01-19 22:38
Basically all of the approaches here are instances of the same template: map x_1, …, x_n to f(x_1) op … op f(x_n), where op is a commutative, associative operation on some set X and f is a map from items to X. This template has been used a couple of times in ways that are provably good.
- Choose a random large prime p and a random residue b in [1, p - 1]. Let f(x) = b^x mod p and let op be addition mod p. We essentially interpret a set as a polynomial (one term X^x per item x) evaluated at b, so two different sets collide exactly when b is a root mod p of the nonzero difference of their polynomials. The Schwartz–Zippel lemma bounds the probability of that happening.
- Let op be XOR and let f be a randomly chosen table. This is Zobrist hashing and minimizes in expectation the number of collisions by straightforward linear-algebraic arguments.
Modular exponentiation is slow, so don't use it. As for Zobrist hashing, with 3 million items the table f probably won't fit into the L2 cache, though it does cap the cost at one main-memory access per lookup.
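
For reference, the two constructions above might be sketched as follows. This is only an illustration under assumptions that are not in the answer: the prime is fixed at 2^61 - 1 instead of being chosen at random, the seeds are arbitrary, and the names `poly_hash` and `Zobrist` are made up here. The Zobrist table is filled lazily so the sketch does not need to know the item range in advance.

```python
import random

# Assumption: a fixed Mersenne prime stands in for "a random large prime p".
P = (1 << 61) - 1
# Assumption: arbitrary seed; B is a random residue in [1, P - 1].
B = random.Random(12345).randrange(1, P)


def poly_hash(items):
    """f(x) = B^x mod P, op = addition mod P. Order cannot matter."""
    return sum(pow(B, x, P) for x in items) % P


class Zobrist:
    """f = table of random words, op = XOR (hypothetical helper class)."""

    def __init__(self, bits=64, seed=0):
        self._rng = random.Random(seed)
        self._bits = bits
        self._table = {}

    def _f(self, x):
        # Draw the random word for x the first time x is seen.
        if x not in self._table:
            self._table[x] = self._rng.getrandbits(self._bits)
        return self._table[x]

    def hash(self, items):
        h = 0
        for x in items:
            h ^= self._f(x)
        return h


z = Zobrist()
print(poly_hash([1, 2, 3]) == poly_hash([3, 1, 2]))  # True
print(z.hash([1, 2, 3]) == z.hash([2, 3, 1]))        # True
```

Note that with XOR a repeated item cancels itself out, so this variant suits sets but not multisets.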

I would instead take Zobrist hashing as a departure point and look for a cheap function f that behaves like a random function. This is essentially the job description of a non-cryptographic pseudorandom generator – I would try computing f by seeding a fast PRG with x and generating one value.
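
One possible way to do this, offered as a sketch and not as the specific generator the answer has in mind, is to use a single step of SplitMix64 as f: the item x is the seed and the first output is f(x). The choice of SplitMix64 and the names `splitmix64` and `set_hash` are assumptions made here for illustration.

```python
MASK64 = (1 << 64) - 1


def splitmix64(x):
    """First output of SplitMix64 when seeded with x (64-bit arithmetic)."""
    z = (x + 0x9E3779B97F4A7C15) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def set_hash(items):
    """Zobrist-style hash: XOR of f(x), with f computed instead of looked up."""
    h = 0
    for x in items:
        h ^= splitmix64(x & MASK64)
    return h


print(set_hash([1, 2, 3]) == set_hash([3, 1, 2]))  # True
```

Compared with a stored table, this trades the memory access for a handful of arithmetic operations per item; whether that is faster in practice depends on the workload and is worth measuring.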

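EDIT: given that the sets all have the same sums, don't choose f to be a degree-1 polynomial (e.g., the step function of a linear congruential generator). If f(x) = a·x + c and op is addition, the hash of a set of n items is a·(x_1 + … + x_n) + n·c, which is identical for all sets of the same size and sum.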
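
A small check of the EDIT above, assuming op is addition mod 2^64 and f is one step of a linear congruential generator. The constants are the well-known ones from Knuth's MMIX generator; the names `lcg_step` and `additive_hash` are hypothetical.

```python
M = 1 << 64
A = 6364136223846793005  # LCG multiplier (Knuth's MMIX)
C = 1442695040888963407  # LCG increment (Knuth's MMIX)


def lcg_step(x):
    """Degree-1 polynomial in x: exactly the kind of f to avoid."""
    return (A * x + C) % M


def additive_hash(items, f):
    """op = addition mod 2^64."""
    return sum(f(x) for x in items) % M


# Same size and same sum, so the linear f cannot tell the sets apart.
print(additive_hash([1, 4], lcg_step) == additive_hash([2, 3], lcg_step))  # True
```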