What's more random, hashlib or urandom?

旧时难觅i 2021-02-05 07:24

I'm working on a project with a friend where we need to generate a random hash. Before we had time to discuss, we both came up with different approaches and because they are us

5 answers
  • 2021-02-05 07:54

    Testing randomness is notoriously difficult. However, I would choose the second method, but ONLY (or at least only as far as comes to mind) for this case, where the hash is seeded by a random number.

    The whole point of hashes is to create a number that is vastly different based on slight differences in input. For your use case, the randomness of the input should do. If, however, you wanted to hash a file and detect one eensy byte's difference, that's when a hash algorithm shines.

    I'm just curious, though: why use a hash algorithm at all? It seems that you're looking for a purely random number, and there are lots of libraries that generate UUIDs, which have far stronger guarantees of uniqueness than random number generators.

  • 2021-02-05 08:09

    The second solution clearly has more entropy than the first. Assuming the quality of the source of the random bits would be the same for os.urandom and random.random:

    • In the second solution you are fetching 16 bytes = 128 bits worth of randomness
    • In the first solution you are fetching a floating point value, which has at most 53 bits of randomness (the precision of an IEEE 754 double as produced by random.random). Then you hash it, which, of course, doesn't add any randomness.

    More importantly, the quality of the randomness coming from os.urandom is expected and documented to be much better than the randomness coming from random.random. os.urandom's docstring says "suitable for cryptographic use".
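
    A rough sketch of the comparison above. The exact code of the "first solution" is not shown in the question, so hashing the text of one random.random() value with SHA-256 is an assumption here; the function names are made up for illustration.

```python
import hashlib
import os
import random


def token_from_float():
    # Assumed form of the "first solution": hash the text of one float.
    # The digest is 256 bits long, but it is fully determined by a value
    # that carries at most 53 random bits.
    text = str(random.random()).encode("ascii")
    return hashlib.sha256(text).hexdigest()


def token_from_urandom():
    # The "second solution": 16 bytes = 128 bits taken directly from the OS.
    return os.urandom(16).hex()


print(token_from_float())
print(token_from_urandom())
```

    Both results look equally random to the eye; the difference is only in how many distinct values can actually occur and how predictable the source is.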

  • 2021-02-05 08:12

    This solution:

    import os
    os.urandom(16).hex()  # Python 3; Python 2 spelled this os.urandom(16).encode('hex')
    

    is the best, since it uses the OS to generate randomness, which should be usable for cryptographic purposes (this depends on the OS implementation).

    random.random() generates pseudo-random values.

    Hashing a random value does not add any new randomness.
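
    A small illustration of that last point, as a sketch: with only 8 possible inputs, hashing still gives only 8 possible outputs, no matter how long or random-looking the digests are. The choice of SHA-256 and of the inputs 0..7 is arbitrary.

```python
import hashlib

# Only 8 possible inputs (3 bits of entropy), each hashed to a 256-bit digest.
inputs = [str(n).encode("ascii") for n in range(8)]
digests = {hashlib.sha256(data).hexdigest() for data in inputs}

# The digests look random, but there are still only 8 possible outcomes.
print(len(digests))  # 8
```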

  • 2021-02-05 08:15

    random.random() is a pseudo-random generator, which means the numbers are generated from a deterministic sequence. If you call random.seed(some_number), the sequence generated after that will always be the same.
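
    For example, reseeding with the same value reproduces the same numbers (the seed 12345 is just an arbitrary example):

```python
import random

random.seed(12345)  # arbitrary example seed
first = [random.random() for _ in range(3)]

random.seed(12345)  # same seed again
second = [random.random() for _ in range(3)]

print(first == second)  # True
```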

    os.urandom() gets its random numbers from the OS's RNG, which uses an entropy pool to collect real random numbers, usually from random events in hardware devices. There even exist special hardware entropy generators for systems where a lot of random numbers are needed.

    On Unix systems there are traditionally two random number devices: /dev/random and /dev/urandom. Reads from the first block if not enough entropy is available, whereas reads from /dev/urandom never block: when not enough entropy data is available, it keeps producing output from a pseudo-random generator seeded from the pool.

    So the choice usually depends on what you need: if you need a few uniformly distributed random numbers, then the built-in PRNG should be sufficient. For cryptographic use it's always better to use random numbers from the OS source.
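
    A minimal sketch of that split, assuming Python 3.6 or later, where the standard library's secrets module draws from the OS random source. The dice-roll use case and variable names are only for illustration.

```python
import random
import secrets

# Non-security use: the built-in PRNG is fast and well distributed.
sample = [random.randint(1, 6) for _ in range(5)]

# Security-sensitive use: draw from the OS source instead.
token = secrets.token_hex(16)            # 16 random bytes as 32 hex characters
secure_roll = secrets.randbelow(6) + 1   # integer in 1..6 from the OS source

print(sample, token, secure_roll)
```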

  • 2021-02-05 08:17

    If you want a unique identifier (UUID), then you should use

    import uuid
    uuid.uuid4().hex
    

    https://docs.python.org/3/library/uuid.html
