Hash function on list independant of order of items in it

后端 未结 9 830
无人共我
无人共我 2021-01-19 21:54

I want to have a dictionary that assigns a value to a set of integers.

For example key is [1 2 3] and value will have certain

9条回答
  •  执笔经年
    2021-01-19 22:38

    Basically all of the approaches here are instantiations of the same template. Map x1, …, xn to f(x1) op … op f(xn), where op is a commutative associative operation on some set X, and f is a map from items to X. This template has been used a couple of times in ways that are provably good.

    • Choose a random large prime p and a random residue b in [1, p - 1]. Let f(x) = bx mod p and let op be addition. We essentially interpret a set as a polynomial and use the Schwartz–Zippel lemma to bound the probability of a collision (= the probability that a nonzero polynomial has b as a root mod p).

    • Let op be XOR and let f be a randomly chosen table. This is Zobrist hashing and minimizes in expectation the number of collisions by straightforward linear-algebraic arguments.

    Modular exponentiation is slow, so don't use it. As for Zobrist hashing, with 3 million items, the table f probably won't fit into L2, though it does set an upper bound of one main-memory access.

    I would instead take Zobrist hashing as a departure point and look for a cheap function f that behaves like a random function. This is essentially the job description of a non-cryptographic pseudorandom generator – I would try computing f by seeding a fast PRG with x and generating one value.

    EDIT: given that the sets all have the same sums, don't choose f to be a degree 1 polynomial (e.g., the step function of a linear congruential generator).

提交回复
热议问题