Question
I'm using this with a length of 20 to generate a UUID. Is it common practice to skip checking whether a generated UUID has already been used when it serves as a persistent unique value?
Or is it best practice to verify that it is not already in use somewhere in your application when retaining uniqueness is essential?
Answer 1:
You can calculate the probability of a collision using this formula from Wikipedia:
$$n(p; H) \approx \sqrt{2H \ln\frac{1}{1-p}}$$
where n(p; H) is the smallest number of samples you have to choose in order to find a collision with a probability of at least p, given H possible outputs with equal probability.
The same article also provides Python source code that you can use to calculate this value:
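    from math import log1p, sqrt

    def birthday(probability_exponent, bits):
        # Target collision probability p = 10 ** probability_exponent
        probability = 10. ** probability_exponent
        # Number of equally likely outputs H = 2 ** bits
        outputs = 2. ** bits
        # n(p; H) ~ sqrt(2 * H * ln(1 / (1 - p)))
        return sqrt(2. * outputs * -log1p(-probability))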
So if you're generating UUIDs with 20 bytes (160 bits) of random data, how sure can you be that there won't be any collisions? Let's suppose you want there to be a probability of less than one in a quintillion (10^-18) that a collision will occur:
>>> birthday(-18,160)
1709679290002018.5
This means that after generating about 1.7 quadrillion UUIDs with 20 bytes of random data each, there is only a one in a quintillion chance that any two of these UUIDs will be the same.
Basically, 20 bytes is perfectly adequate.
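To go the other way (from a number of generated UUIDs to a collision probability), the same birthday approximation can be inverted. The following is a minimal sketch, not taken from the Wikipedia article; the function name collision_probability is made up for illustration, and it assumes every output is equally likely.

```python
from math import expm1

def collision_probability(samples, bits):
    # Hypothetical helper: approximate probability that at least two of
    # `samples` uniformly random values of `bits` bits are equal.
    # Uses p ~ 1 - exp(-n * (n - 1) / (2 * H)), the inverse of the
    # birthday formula above.
    outputs = 2.0 ** bits
    pairs = samples * (samples - 1) / 2.0
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -expm1(-pairs / outputs)

# About 1.7 quadrillion UUIDs of 160 bits each
p = collision_probability(1709679290002018, 160)
```

Under these assumptions, p comes out at roughly 10^-18, which matches the birthday(-18, 160) result above.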
Answer 2:
crypto.randomBytes is safe enough for most applications. If you want it to be completely secure, use a length of 16. With a length of 16, a collision is unlikely to ever occur within the next century. It is also not a good idea to check the entire database for duplicates: the odds of a collision are so low that the performance cost of the check outweighs the added safety.
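If uniqueness still has to be guaranteed, one option that avoids scanning the database is to let a unique index reject duplicates at insert time. This is an assumption on my part rather than something the answer states. The sketch below uses Python's secrets and sqlite3 modules instead of Node's crypto, and the table name items and the helper insert_with_unique_id are hypothetical.

```python
import secrets
import sqlite3

def new_id(num_bytes=16):
    # secrets.token_hex draws from the OS CSPRNG; 16 bytes -> 32 hex characters.
    return secrets.token_hex(num_bytes)

def insert_with_unique_id(conn, payload, max_attempts=3):
    # Hypothetical helper: the PRIMARY KEY constraint rejects a duplicate id,
    # so no separate "does this id already exist?" query is needed.
    for _ in range(max_attempts):
        candidate = new_id()
        try:
            conn.execute(
                "INSERT INTO items (id, payload) VALUES (?, ?)",
                (candidate, payload),
            )
            return candidate
        except sqlite3.IntegrityError:
            continue  # collision (astronomically unlikely): draw a fresh id
    raise RuntimeError("could not generate a unique id")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, payload TEXT)")
first = insert_with_unique_id(conn, "example")
```

The retry path should practically never run; it only exists so that a collision, if one ever happened, would surface as a retry instead of silently overwriting data.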
Source: https://stackoverflow.com/questions/49267840/are-the-odds-of-a-cryptographically-secure-random-number-generator-generating-th