Probability of collision when using a 32 bit hash

温柔的废话 2020-11-28 03:11

I have a 10-character string key field in a database. I've used CRC32 to hash this field, but I'm worried about duplicates. Could somebody show me the probability of collisio

2 Answers
  • 2020-11-28 03:18

    In the case you cite, at least one collision is essentially guaranteed. The probability of at least one collision is about 1 - 3x10^-51. The average number of collisions you would expect is about 116.

    In general, the average number of collisions in k samples, each a random choice among n possible values is:

    N(n,k)~=k(k-1)/(2n)

    The probability of at least one collision is:

    p(n,k)~=1-e^(-k(k-1)/(2n))
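
A small simulation can serve as a sanity check on the formula for N(n,k). The sketch below is not part of the original answer: the function names, the reduced sizes (n = 2^16, k = 1000), and the fixed seed are all assumptions chosen so that it runs quickly. It draws uniformly random values rather than real CRC32 outputs, which matches the "random choice among n possible values" model used above.

```python
import random
from collections import Counter


def count_colliding_pairs(n, k, rng):
    """Draw k values uniformly from range(n) and count pairs sharing a value."""
    counts = Counter(rng.randrange(n) for _ in range(k))
    # A value drawn c times contributes c*(c-1)/2 colliding pairs.
    return sum(c * (c - 1) // 2 for c in counts.values())


def average_colliding_pairs(n, k, trials, seed=0):
    """Average the colliding-pair count over several independent trials."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    total = sum(count_colliding_pairs(n, k, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    n, k = 2**16, 1000  # hypothetical small sizes, not the asker's case
    print("simulated:", average_colliding_pairs(n, k, trials=200))
    print("formula:  ", k * (k - 1) / (2 * n))
```

The two printed numbers should land close to each other; the simulated figure fluctuates with the seed and the number of trials.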

    In your case, n = 2^32 and k = 10^6.
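
The figures quoted above can be reproduced by plugging these values into the two formulas. This is a minimal sketch with hypothetical function names. It reports the probability of no collision separately, because 1 - 3x10^-51 is indistinguishable from 1.0 in double-precision floating point.

```python
import math


def expected_collisions(n, k):
    """N(n,k) ~= k(k-1)/(2n): expected number of colliding pairs."""
    return k * (k - 1) / (2 * n)


def no_collision_probability(n, k):
    """e^(-k(k-1)/(2n)): approximate chance that all k values are distinct."""
    return math.exp(-expected_collisions(n, k))


def collision_probability(n, k):
    """p(n,k) ~= 1 - e^(-k(k-1)/(2n)), via expm1 to keep precision when p is tiny."""
    return -math.expm1(-expected_collisions(n, k))


if __name__ == "__main__":
    n, k = 2**32, 10**6
    print(expected_collisions(n, k))       # about 116.4
    print(no_collision_probability(n, k))  # on the order of 3e-51
    print(collision_probability(n, k))     # rounds to 1.0 in floating point
```

The same functions give roughly 0.5 for n = 365 and k = 23, the classic birthday-problem result.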

    The probability of a three-way collision in your case is about 0.01. See the Birthday Problem.
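
The answer does not say how the three-way figure was obtained. One common approximation, used here as an assumption, treats each of the C(k,3) triples as colliding with probability 1/n^2 and applies the same exponential form as for pairs. The function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math


def expected_triple_collisions(n, k):
    """Expected number of triples sharing one value: C(k,3) / n^2.

    Assumption: each triple collides with probability 1/n^2 (uniform hashes).
    """
    return math.comb(k, 3) / n**2


def triple_collision_probability(n, k):
    """Poisson-style approximation: 1 - e^(-C(k,3)/n^2)."""
    return -math.expm1(-expected_triple_collisions(n, k))


if __name__ == "__main__":
    n, k = 2**32, 10**6
    print(triple_collision_probability(n, k))  # about 0.009
```

Under this approximation the result is about 0.009, consistent with the rounded value of 0.01 given above.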

  • 2020-11-28 03:40

    Duplicate of Expected collisions for perfect 32bit crc

    The answer referenced this article: http://arstechnica.com/civis/viewtopic.php?f=20&t=149670

    A chart of hash collision probabilities (image not preserved here) comes from: http://preshing.com/20110504/hash-collision-probabilities
