Is python's random number generation easily reproducible?


轮回少年 2021-01-13 12:01

I was reading about Python's random module in the standard library. It amazes me that when I set the seed and produce a few random numbers:

random.seed(1)
for i


      
      
        
4 answers

余生分开走 (original poster) 2021-01-13 12:31

The existing answers are great, but I'll just add a few points. 

Update:

Actually, if you don't supply a seed, the random number generator is seeded with random bits from the system random source; it only falls back to using the system time as a seed if the OS doesn't have a random source. Also note that recent versions of Python can use an improved seeding scheme. From the docs:


  random.seed(a=None, version=2)
  
  Initialize the random number generator.
  
  If a is omitted or None, the current system time is used. If
  randomness sources are provided by the operating system, they are used
  instead of the system time (see the os.urandom() function for
  details on availability).
  
  If a is an int, it is used directly.
  
  With version 2 (the default), a str, bytes, or bytearray object gets
  converted to an int and all of its bits are used.
  
  With version 1 (provided for reproducing random sequences from older
  versions of Python), the algorithm for str and bytes generates a
  narrower range of seeds.
  
  Changed in version 3.2: Moved to the version 2 scheme which uses all of the bits in a string seed.
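
To tie the documentation back to the original question, here is a minimal sketch (Python 3) showing that an explicit seed makes the sequence reproducible. The helper name `sample` and the seed string "captcha" are only for illustration; the `version` argument is the one described in the docs quoted above.

```python
import random

def sample(seed_value, n=5, version=2):
    """Return n two-digit integers drawn right after seeding the module-level generator."""
    random.seed(seed_value, version=version)
    return [random.randint(10, 99) for _ in range(n)]

# Same integer seed, same sequence.
first = sample(1)
second = sample(1)
print(first == second)  # True

# String seeds are reproducible as well, under either conversion scheme.
v2 = sample("captcha", version=2)
v1 = sample("captcha", version=1)
print(v2 == sample("captcha", version=2))  # True
print(v1 == sample("captcha", version=1))  # True
```

Reproducibility here only holds for the same seed, the same `version`, and the same sequence of calls; the two schemes are not expected to agree with each other for string seeds.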




Generating a CAPTCHA code is not a high-security application compared to, say, generating secret cryptographic keys, especially keys that are intended to be used multiple times. As a corollary, the amount of entropy required to generate a CAPTCHA code is smaller than what's required for a cryptographic key.
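
To put rough numbers on that comparison, here is a small sketch. The CAPTCHA format (6 characters from a 36-symbol alphabet) and the 128-bit key size are assumptions chosen for illustration, not figures from the question.

```python
import math

def entropy_bits(alphabet_size, length):
    """Bits of entropy in a uniformly random string of the given length."""
    return length * math.log2(alphabet_size)

# Hypothetical CAPTCHA: 6 characters drawn from 26 letters + 10 digits.
captcha_bits = entropy_bits(36, 6)
# A 128-bit key, expressed as 16 uniformly random bytes.
key_bits = entropy_bits(256, 16)

print("CAPTCHA: %.1f bits" % captcha_bits)
print("Key:     %.1f bits" % key_bits)
```

Under these assumptions the CAPTCHA carries about 31 bits against the key's 128, so the generator behind it has far less to protect.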

Bear in mind that the system time used to seed random is (probably) not the system time in seconds. It's more likely to be the time in microseconds, or even nanoseconds, so it's not easy for an attacker to figure the seed out with a brute-force search, apart from the considerations mentioned by Ned.
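
As a rough illustration of what such a brute-force search looks like, here is a sketch in Python 3. It assumes a simplified scenario that is not Python's actual default seeding: the seed is a timestamp in whole microseconds, the attacker can observe raw generator outputs, and the attacker knows the seeding time to within 2 milliseconds. The function names and the timestamp value are hypothetical.

```python
import random

def first_outputs(seed_value, n=3):
    """First n floats produced by a Mersenne Twister seeded with seed_value."""
    rng = random.Random(seed_value)
    return [rng.random() for _ in range(n)]

def recover_seed(observed, guess, window):
    """Try every integer seed within +/- window of guess; return the match or None."""
    for candidate in range(guess - window, guess + window + 1):
        if first_outputs(candidate, len(observed)) == observed:
            return candidate
    return None

# Assumed scenario: the seed is a microsecond timestamp.
secret_seed = 1436617059071794
observed = first_outputs(secret_seed)

# The attacker's clock estimate is off by 1.234 ms.
attacker_guess = secret_seed + 1234
found = recover_seed(observed, attacker_guess, window=2000)
print(found == secret_seed)
```

The number of candidates grows with the clock resolution: a 2 ms uncertainty means about 4,000 seeds at microsecond resolution and about 4,000,000 at nanosecond resolution. A real attacker would also rarely see raw outputs, only values derived from them.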

Here's a quick demo, running on Python 2.6.6 on a 2GHz Linux system. 

#!/usr/bin/env python
''' random seeding demo'''

from __future__ import print_function
import time
from random import seed, randint

def rf():
    # One random two-digit integer
    return randint(10, 99)

def put_time():
    # Print the current time with sub-second precision
    print('%.15f' % time.time())

r = range(10)
a = []

put_time()
for i in r:
    # Re-seed with no argument on every pass, then draw ten numbers
    seed()
    a.append([rf() for j in r])
put_time()

for row in a:
    print(row)


Typical output

1436617059.071794986724854
1436617059.074091911315918
[95, 25, 50, 75, 80, 38, 21, 26, 85, 82]
[75, 96, 14, 13, 76, 53, 94, 68, 80, 66]
[79, 33, 65, 86, 12, 32, 80, 83, 36, 42]
[28, 47, 62, 21, 52, 30, 54, 62, 22, 28]
[22, 40, 71, 36, 78, 64, 17, 33, 99, 43]
[81, 15, 32, 15, 63, 57, 83, 67, 12, 62]
[22, 56, 54, 55, 51, 56, 34, 56, 94, 16]
[64, 82, 37, 80, 70, 91, 56, 41, 55, 12]
[47, 37, 64, 14, 69, 65, 42, 17, 22, 17]
[43, 43, 73, 82, 61, 55, 32, 52, 86, 74]


As you can see, less than 3 milliseconds elapse between the start of the outer loop and its end, but all of the lists in a are quite different.

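Note that the seed passed to random.seed() can be any hashable object, and when you pass it a non-integer (e.g. a float like the system time), it first gets hashed to create an integer. (That describes the Python versions current when this was written; since Python 3.11 the seed must be None, an int, a float, a str, bytes, or a bytearray.)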

Still, there's no need to rely only on the system time as the seed: you can use SystemRandom or os.urandom() to get the seed. That way, the seed is more unpredictable, but you still get the speed of Mersenne Twister; SystemRandom is a little slower than Mersenne Twister because it has to make system calls. However, even urandom isn't totally safe.
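
A minimal sketch of that approach, written for Python 3 (it uses `int.from_bytes`, which the Python 2.6 demo above does not have). The helper name `make_rng` and the choice of 16 seed bytes are assumptions for illustration.

```python
import os
import random

def make_rng(num_bytes=16):
    """Return (rng, seed): a Mersenne Twister seeded with num_bytes from os.urandom()."""
    seed_value = int.from_bytes(os.urandom(num_bytes), "big")
    return random.Random(seed_value), seed_value

rng, seed_value = make_rng()
codes = [rng.randint(10, 99) for _ in range(10)]

# Keeping the seed lets the same sequence be replayed later, e.g. for debugging.
replay = random.Random(seed_value)
print(codes == [replay.randint(10, 99) for _ in range(10)])  # True

# SystemRandom draws every value from the OS source; it cannot be seeded or replayed.
sys_rng = random.SystemRandom()
print(sys_rng.randint(10, 99))
```

Using a separate `random.Random` instance rather than the module-level functions keeps this generator's state independent of any other code that calls `random.seed()`.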

From the GNU urandom man page:


  The random number generator gathers environmental noise from device
  drivers and other sources into an entropy pool. The generator also
  keeps an estimate of the number of bits of noise in the entropy pool.
  From this entropy pool random numbers are created. 
  
  When read, the /dev/random device will only return random bytes
  within the estimated number of bits of noise in the entropy pool.
  /dev/random should be suitable for uses that need very high quality
  randomness such as one-time pad or key generation. When the entropy
  pool is empty, reads from /dev/random will block until additional
  environmental noise is gathered. 
  
  A read from the /dev/urandom device will not block waiting for more
  entropy. As a result, if there is not sufficient entropy in the
  entropy pool, the returned values are theoretically vulnerable to a
  cryptographic attack on the algorithms used by the driver. Knowledge
  of how to do this is not available in the current unclassified
  literature, but it is theoretically possible that such an attack may
  exist. If this is a concern in your application, use /dev/random
  instead.  
  
  Usage
  
  If you are unsure about whether you should use
  /dev/random or /dev/urandom, then probably you want to use the latter.
  As a general rule, /dev/urandom should be used for everything except
  long-lived GPG/SSL/SSH keys.
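
Following the man page's rule of thumb, one way to generate a CAPTCHA code straight from the OS source is the standard-library `secrets` module (Python 3.6+). This is only a sketch: the alphabet and the 6-character length are assumptions, and `captcha_code` is a hypothetical helper name.

```python
import secrets
import string

# Hypothetical CAPTCHA alphabet: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits

def captcha_code(length=6):
    """Build a code by picking each character with the OS-backed generator."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(captcha_code())
```

Codes produced this way cannot be replayed from a seed, which is usually what you want for a CAPTCHA but not for a reproducible test.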

    
             
                                                        
            
            
              
                