Random number in the range 1 to sys.maxsize is always 1 mod 2^10

I am trying to find the statistical properties of the PRNGs available in Python (2.7.10) by using the frequency test, runs test and the chi squared test.

For carryin

3 Answers

无人及你

2021-01-01 09:11

That depends on a lot of things, like how exactly the RNG is implemented, how many bits of state it uses, and how exactly the sample function is implemented.

Here's what the documentation says:

Almost all module functions depend on the basic function random(), which generates a random float uniformly in the semi-open range [0.0, 1.0). Python uses the Mersenne Twister as the core generator. It produces 53-bit precision floats and has a period of 2**19937-1.

So if sample() indeed uses random() under the hood, you should expect at most 53 meaningful bits in your result.
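
A quick way to check the 53-bit claim is to scale random() outputs by 2**53 and see whether the result is always a whole number. This is only a minimal sketch: the helper name is made up for illustration, and it assumes CPython's Mersenne Twister-based random() on a platform with IEEE 754 doubles.

```python
import random

def is_multiple_of_2_pow_minus_53(x):
    # Scaling a double by a power of two is exact (barring overflow),
    # so this tests whether x == k / 2**53 for some integer k.
    return (x * 2.0**53).is_integer()

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.random() for _ in range(10000)]
print(all(is_multiple_of_2_pow_minus_53(x) for x in samples))
```

If every sample passes, each float carries no more than 53 bits of information, which is all that any value derived from a single random() call can inherit.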

太阳男子

2021-01-01 09:22
@roeland hinted at the cause: in Python 2, sample() repeatedly computes int(random.random() * n). Look at the source code (in your Python's Lib/random.py) for full details. In short, random.random() returns a float with no more than 53 significant bits. You're obviously on a machine where sys.maxsize == 2**63 - 1, so after scaling by n (about 2**63) the product still carries only 53 significant bits, and int() leaves the remaining low-order bits as zeroes. Indexing your base (xrange(1, sys.maxsize)) by an even integer with "a lot" of low-order 0 bits then always returns an odd integer: the index plus 1, whose low-order bits are all 0 except for the last.
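
The effect can be reproduced on any Python by spelling out that index computation by hand. This is only a sketch of the int(random.random() * n) step, not the actual Lib/random.py code; py2_style_pick is a made-up name, and 2**63 - 1 is hard-coded as an assumed stand-in for sys.maxsize on a 64-bit box.

```python
import random

MAXSIZE = 2**63 - 1            # assumed value of sys.maxsize (64-bit build)
n = MAXSIZE - 1                # len(xrange(1, sys.maxsize))

def py2_style_pick(rng):
    j = int(rng.random() * n)  # index with at most 53 significant bits
    return 1 + j               # xrange(1, sys.maxsize)[j] == 1 + j

rng = random.Random(0)         # fixed seed so the run is repeatable
picks = [py2_style_pick(rng) for _ in range(1000)]
print(set(p % 2**10 for p in picks))
```

With IEEE 754 doubles, the set of residues collapses to the single value 1, which is exactly the "always 1 mod 2^10" pattern from the question.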

In Python 3 none of that happens - random in Python 3 uses stronger algorithms, and only falls back to random.random() when necessary. For example, here under Python 3.4.3:
```
>>> hex(random.randrange(10**70))
'0x91fc11ed768be3a454bd66f593c218d8bbfa3b99f6285291e1d9f964a9'
>>> hex(random.randrange(10**70))
'0x7b07ff02b6676801e33094fca2fcca7f6e235481c479c521643b1acaf4'
```
EDIT

Here's a more directly relevant example, under 3.4.3 on a 64-bit box:
```
>>> import random, sys
>>> sys.maxsize == 2**63 - 1
True
>>> for i in random.sample(range(1, sys.maxsize), 6):
...    print(bin(i))
0b10001100101001001111110110011111000100110100111001100000010110
0b100111100110110100111101001100001100110001110010000101101000101
0b1100000001110000110100111101101010110001100110101111011100111
0b111110100001111100101001001001101101100100011001001010100001110
0b1100110100000011100010000011010010100100110111001111100110100
0b10011010000110101010101110001000101110111100100001111101110111
```
Python 3 doesn't invoke random.random() at all in this case, but instead iteratively grabs chunks of 32 bits from the underlying Mersenne Twister (32-bit unsigned ints are "the natural" outputs from this implementation of MT), pasting them together to build a suitable index. So, in Python 3, platform floats have nothing to do with it; in Python 2, quirks of float behavior have everything to do with it.
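
The idea of building the index from raw bits rather than from a float can be sketched with random.getrandbits() and rejection sampling. This illustrates the approach and is not a copy of CPython's internals; randbelow_sketch is a hypothetical helper name, and 2**63 - 1 is again hard-coded in place of sys.maxsize.

```python
import random

def randbelow_sketch(n, rng):
    """Return a uniform integer in range(n) built only from raw random bits."""
    k = n.bit_length()          # enough bits to cover 0 .. n-1
    r = rng.getrandbits(k)
    while r >= n:               # reject out-of-range draws to stay uniform
        r = rng.getrandbits(k)
    return r

rng = random.Random(0)          # fixed seed so the run is repeatable
n = 2**63 - 2                   # len(range(1, 2**63 - 1))
picks = [1 + randbelow_sketch(n, rng) for _ in range(1000)]
print(len(set(p % 2**10 for p in picks)))
```

Because no float is involved, the low-order 10 bits take many different values instead of being stuck at a single residue.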
谎友^

2021-01-01 09:30

It certainly looks like rounding error in random.sample.

The bottom bits (ten of them, going by the 2^10 in the title) are always zero after the multiplication by the spread of the range (maxsize - 1); then, when the start of the range (1) is added, the lowest bit becomes 1 and the result always ends in 0...01.

If the multiplication were working exactly, then given that the spread is not a power of two, I'd expect to see varying values in the rightmost bits too, even though the random number has only 53 varying bits.
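
The rounding can be made visible directly: the spread, 2**63 - 2, has no exact double representation, so it is already rounded when it is converted to a float for the multiplication. A small sketch, assuming IEEE 754 doubles and sys.maxsize == 2**63 - 1 (hard-coded below); the value of k is arbitrary and chosen only for illustration.

```python
spread = (2**63 - 1) - 1        # maxsize - 1, assuming a 64-bit build

# A double has a 53-bit significand, so the 63-bit spread is rounded
# to the nearest representable value, which is exactly 2**63.
print(float(spread) == 2.0**63)

# A random() result has the form k / 2**53 for an integer k, so the
# float product is k * 2**10: a multiple of 1024 whatever k is.
k = 123456789                   # arbitrary illustrative value
product = int((k / 2.0**53) * spread)
print(product == k * 2**10, product % 2**10)
```

So the spread effectively behaves like a power of two once it has been converted to a float, which is why the rightmost bits never vary.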
