Python: where is random.random() seeded?

天大地大妈咪最大 提交于 2019-12-30 21:08:00

问题


Say I have some python code:

import random
r=random.random()

Where is the value of r seeded from in general?
And what if my OS has no random, then where is it seeded?
Why isn't this recommended for cryptography? Is there some way to know what the random number is?


回答1:


Follow da code.

To see where the random module "lives" in your system, you can just do in a terminal:

>>> import random
>>> random.__file__
'/usr/lib/python2.7/random.pyc'

That gives you the path to the .pyc ("compiled") file, which is usually located side by side to the original .py where readable code can be found.

Let's see what's going on in /usr/lib/python2.7/random.py:

You'll see that it creates an instance of the Random class and then (at the bottom of the file) "promotes" that instance's methods to module functions. Neat trick. When the random module is imported anywhere, a new instance of that Random class is created, its values are then initialized and the methods are re-assigned as functions of the module, making it quite random on a per-import (erm... or per-python-interpreter-instance) basis.

_inst = Random()
seed = _inst.seed
random = _inst.random
uniform = _inst.uniform
triangular = _inst.triangular
randint = _inst.randint

The only thing that this Random class does in its __init__ method is seeding it:

class Random(_random.Random):
    ...
    def __init__(self, x=None):
        self.seed(x)    
...
_inst = Random()
seed = _inst.seed

So... what happens if x is None (no seed has been specified)? Well, let's check that self.seed method:

def seed(self, a=None):
    """Initialize internal state from hashable object.

    None or no argument seeds from current time or from an operating
    system specific randomness source if available.

    If a is not None or an int or long, hash(a) is used instead.
    """

    if a is None:
        try:
            a = long(_hexlify(_urandom(16)), 16)
        except NotImplementedError:
            import time
            a = long(time.time() * 256) # use fractional seconds

    super(Random, self).seed(a)
    self.gauss_next = None

The comments already tell what's going on... This method tries to use the default random generator provided by the OS, and if there's none, then it'll use the current time as the seed value.

But, wait... What the heck is that _urandom(16) thingy then?

Well, the answer lies at the beginning of this random.py file:

from os import urandom as _urandom
from binascii import hexlify as _hexlify

Tadaaa... The seed is a 16 bytes number that came from os.urandom

Let's say we're in a civilized OS, such as Linux (with a real random number generator). The seed used by the random module is the same as doing:

>>> long(binascii.hexlify(os.urandom(16)), 16)
46313715670266209791161509840588935391L

The reason of why specifying a seed value is considered not so great is that the random functions are not really "random"... They're just a very weird sequence of numbers. But that sequence will be the same given the same seed. You can try this yourself:

>>> import random
>>> random.seed(1)
>>> random.randint(0,100)
13
>>> random.randint(0,100)
85
>>> random.randint(0,100)
77

No matter when or how or even where you run that code (as long as the algorithm used to generate the random numbers remains the same), if your seed is 1, you will always get the integers 13, 85, 77... which kind of defeats the purpose (see this about Pseudorandom number generation) On the other hand, there are use cases where this can actually be a desirable feature, though.

That's why is considered "better" relying on the operative system random number generator. Those are usually calculated based on hardware interruptions, which are very, very random (it includes interruptions for hard drive reading, keystrokes typed by the human user, moving a mouse around...) In Linux, that O.S. generator is /dev/random. Or, being a tad picky, /dev/urandom (that's what Python's os.urandom actually uses internally) The difference is that (as mentioned before) /dev/random uses hardware interruptions to generate the random sequence. If there are no interruptions, /dev/random could be exhausted and you might have to wait a little bit until you can get the next random number. /dev/urandom uses /dev/random internally, but it guarantees that it will always have random numbers ready for you.

If you're using linux, just do cat /dev/random on a terminal (and prepare to hit Ctrl+C because it will start output really, really random stuff)

borrajax@borrajax:/tmp$ cat /dev/random
_+�_�?zta����K�����q�ߤk��/���qSlV��{�Gzk`���#p$�*C�F"�B9��o~,�QH���ɭ�f�޺�̬po�2o𷿟�(=��t�0�p|m�e
���-�5�߁ٵ�ED�l�Qt�/��,uD�w&m���ѩ/��;��5Ce�+�M����
~ �4D��XN��?ס�d��$7Ā�kte▒s��ȿ7_���-     �d|����cY-�j>�
                    �b}#�W<դ���8���{�1»
.       75���c4$3z���/̾�(�(���`���k�fC_^C

Python uses the OS random generator or a time as a seed. This means that the only place where I could imagine a potential weakness with Python's random module is when it's used:

  • In an OS without an actual random number generator, and
  • In a device where time.time is always reporting the same time (has a broken clock, basically)

If you are concerned about the actual randomness of the random module, you can either go directly to os.urandom or use the random number generator in the pycrypto cryptographic library. Those are probably more random. I say more random because...

Image inspiration came from this other SO answer



来源:https://stackoverflow.com/questions/27284943/python-where-is-random-random-seeded

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!