Multiplying a huge number times random() (Python)

独厮守ぢ 2020-11-30 16:10

Problem: Generate large binary strings (length 2000+). Do it quickly, as this generateRandom() function will be called 300,000 times in the algorithm.

3 answers
  • 2020-11-30 16:21

    To go from J.F. Sebastian's answer to a binary string (string with 0 and 1 characters in it):

    >>> import random
    >>> r = random.SystemRandom()
    >>> bin(r.getrandbits(2000))[2:].zfill(2000)


    With this benchmark:

    import random
    import time
    
    def run(n):
        r = random.SystemRandom()
        for i in range(n):
            if i % 30000 == 0:
                print(i)  # progress marker every 30,000 iterations
            bin(r.getrandbits(2000))[2:].zfill(2000)
    
    s = time.time()
    run(300000)
    e = time.time()
    print("Took %.2fs" % (e - s,))
    

    The result was: Took 12.32s

    Just getting the random bits without any string conversion (only r.getrandbits(2000)) took 7.77s, so if you can use the random bits directly as an integer, you'll save yourself some time.

    Re-running the benchmark using os.urandom(250) instead (without additional processing) took only 3.59s, so that seems to be the fastest option.
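
    A minimal sketch of how the comparison above could be reproduced with timeit. The function names and the reduced call count are assumptions for illustration, not part of the original benchmark, and the timings will differ from machine to machine.

```python
import os
import random
import timeit

K = 2000  # bits per string, as in the question

_sysrand = random.SystemRandom()


def bits_via_getrandbits():
    # Build an integer from the OS entropy source, then render it as '0'/'1' text.
    return bin(_sysrand.getrandbits(K))[2:].zfill(K)


def raw_bytes_via_urandom():
    # K // 8 == 250 raw bytes, with no conversion to '0'/'1' characters.
    return os.urandom(K // 8)


if __name__ == "__main__":
    calls = 10_000  # scaled down from 300,000 to keep the run short
    for fn in (bits_via_getrandbits, raw_bytes_via_urandom):
        elapsed = timeit.timeit(fn, number=calls)
        print(f"{fn.__name__}: {elapsed:.3f}s for {calls} calls")
```

    Note that the two functions do not return the same kind of object: the first yields a 2000-character string, the second 250 bytes that still need converting if "0"/"1" text is required.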

  • 2020-11-30 16:34

    Try os.urandom(250), which returns 250 bytes (2000 bits) as a binary string, i.e. bytes. Measure whether it is fast enough for your purposes; the "randomness" might diminish the more you call it.

    To avoid the "long int too large to convert to float" error, don't use floats.

    If you need an integer with k random bits instead of a binary string:

    import random
    r = random.SystemRandom()
    
    n = r.getrandbits(2000) # uses os.urandom() under the hood
    

    To get a string of "0"s and "1"s:

    k = 2000
    binstr = "{:0{}b}".format(r.getrandbits(k), k)
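
    If cryptographic-quality randomness is not required (an assumption; the question does not say), the same formatting works with the module-level generator, which is a Mersenne Twister and does not call os.urandom() for each draw. The helper name random_binstr below is hypothetical; whether this is fast enough is something to measure.

```python
import random


def random_binstr(k, rng=random):
    """Return a string of exactly k '0'/'1' characters.

    rng may be the random module itself (Mersenne Twister) or any object
    with a getrandbits() method, such as random.SystemRandom().
    """
    # The nested {} takes the field width from the second argument,
    # so the result is zero-padded to exactly k characters.
    return "{:0{}b}".format(rng.getrandbits(k), k)


binstr = random_binstr(2000)
print(len(binstr))
```

    Passing rng=random.SystemRandom() gives the same behaviour as the snippet above.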
    

    Note: you can't use randint/randrange for large numbers if getrandbits is not used:

    import random
    
    class R(random.Random):
        def random(self): # override random to suppress getrandbits usage
            return random.random()
    
    r = R()
    r.randrange(2**2000) # -> OverflowError: long int too large to convert to float
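
    The same failure can be shown directly, without a subclass. The sketch below multiplies random.random() by a huge integer, as the question title describes, and contrasts it with the integer-only calls; the variable names are illustrative.

```python
import random

big = 2 ** 2000  # far beyond the largest finite float (just under 2 ** 1024)

# Multiplying a float by this integer forces an int -> float conversion,
# which is the step that overflows.
try:
    random.random() * big
    overflowed = False
except OverflowError:
    overflowed = True

# Integer-only alternatives never touch floats, so they work at any size.
value = random.getrandbits(2000)  # 0 <= value < 2 ** 2000
other = random.randrange(big)     # same range, no float arithmetic involved

print(overflowed, value.bit_length() <= 2000, other < big)
```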
    

    b2a_bin

    The b2a_bin() extension creates binary strings ("01") directly from bytestrings without creating an intermediate Python integer. It is 3-20 times faster than these pure Python analogs:

    def b2a_bin_bin(data):
        return bin(int.from_bytes(data, 'big', signed=False)
                   )[2:].zfill(len(data)*8).encode('ascii', 'strict')
    
    def b2a_bin_format(data):
        n = int.from_bytes(data, 'big', signed=False)
        return "{:0{}b}".format(n, len(data)*8).encode('ascii', 'strict')
    

    Usage:

    >>> import os
    >>> from b2a_bin import b2a_bin
    >>> b2a_bin(b'\x0a')
    b'00001010'
    >>> b2a_bin(os.urandom(5))
    b'1001111011000011111001110010000101111010'
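
    Another pure Python way to skip the intermediate integer is a 256-entry lookup table. This is only a sketch: bytes_to_bitstring is a hypothetical helper, not part of the b2a_bin extension, and it returns str rather than bytes. Its speed relative to the versions above has not been measured here.

```python
import os

# Precompute the 8-character binary text for every possible byte value.
_BYTE_TO_BITS = [format(i, "08b") for i in range(256)]


def bytes_to_bitstring(data):
    """Return a str of '0'/'1' characters, 8 per input byte."""
    # Iterating over bytes yields ints in Python 3, which index the table.
    return "".join(_BYTE_TO_BITS[b] for b in data)


print(bytes_to_bitstring(b"\x0a"))
print(len(bytes_to_bitstring(os.urandom(250))))
```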
    
  • 2020-11-30 16:35

    Is random.randrange really too slow? Let's see how slow it really is.

    import random
    
    word_size = 2048
    word_max = 2 ** word_size
    
    def random_bits(n):
        """
        Return a string consisting of `n` zeroes and ones (chosen randomly).
        """
        def words():
            s, m, r = word_size, word_max, n % word_size
            for _ in range(n // s):
                yield bin(random.randrange(m))[2:].zfill(s)
            if r:  # skip the remainder when n is a multiple of word_size
                yield bin(random.randrange(2 ** r))[2:].zfill(r)
        return ''.join(words())
    
    >>> from timeit import Timer
    >>> Timer(lambda:random_bits(2000)).timeit(number=300000)
    9.680696964263916
    

    10 seconds doesn't seem an absurd amount of time for choosing 600 million random bits. So perhaps you can say more about your speed requirement. Is this really too slow?
