Question
I want to shuffle a long sequence (say it has more than 10000 elements) many times (say 10000). While reading the documentation for Python's random module, I found the following:
Note that even for small len(x), the total number of permutations of x can quickly grow larger than the period of most random number generators. This implies that most permutations of a long sequence can never be generated. For example, a sequence of length 2080 is the largest that can fit within the period of the Mersenne Twister random number generator
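The 2080 figure in that note can be checked with exact integer arithmetic. This is a minimal sketch, assuming the Mersenne Twister period of 2**19937 - 1; the variable names are only illustrative:

```python
# Period of the Mersenne Twister (MT19937) behind Python's random module.
PERIOD = 2**19937 - 1

# Find the largest n such that n! still fits within the period.
n = 1
factorial = 1
while factorial * (n + 1) <= PERIOD:
    n += 1
    factorial *= n

largest_n = n
print(largest_n)
```

The loop stops as soon as (n + 1)! would exceed the period, so `largest_n` should come out as the 2080 quoted in the documentation.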
I have two groups (could be more) and each has many values. The sequence I want to shuffle is the list of all values available regardless of the group. My concern is that the note implies that the shuffle I need may not be provided by the random.shuffle() function.
I have thought about some workarounds:
- Re-initialize the random number generator (with random.seed()) several times, at certain iterations. That way, it does not matter if there are more permutations than the period, because different seeds will give different results.
- Use sample(range(length of sequence), k=size of a group) to get random indices and then use those to index within each group. That way I may not run out of permutations due to the period of the random number generator.
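For concreteness, the second workaround might look roughly like the sketch below. The groups `group_a` and `group_b` and their values are hypothetical, made up for illustration. Note that random.sample draws from the same Mersenne Twister generator as random.shuffle, so this only shows the mechanics; it does not by itself settle the period concern.

```python
import random

# Hypothetical example data: two groups of values.
group_a = [1.2, 3.4, 2.2, 5.1]
group_b = [0.7, 4.4, 3.9, 2.8, 1.5, 6.0]

# Pool all values regardless of group.
pooled = group_a + group_b

# Pick which pooled positions go to the first resampled group.
chosen = random.sample(range(len(pooled)), k=len(group_a))
chosen_set = set(chosen)

new_a = [pooled[i] for i in chosen]
# Everything that was not chosen forms the second resampled group.
new_b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
```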
Would any of my alternatives help?
Thanks a lot!
Answer 1:
Well, 10,000! ≈ 10^35,659 (roughly 10^36,000).
That is a lot of possible permutations. The best you could do is to delve into how your operating system or hardware accumulates "truly random" bits. You could then collect the ~120,000 bits of randomness needed (log2(10,000!) is about 118,500), use them to pick a random number n, and use an algorithm that generates the n'th permutation of your input list.
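A rough sketch of that idea follows. It assumes the standard library's secrets module as the source of OS randomness, and the helper name `permutation_from_index` is made up for illustration. Rather than producing permutations in lexicographic order, it uses the digits of n in the factorial number system to drive Fisher-Yates swaps, which still gives a distinct permutation for every n below len(items)!:

```python
import math
import secrets

def permutation_from_index(items, n):
    """Map an integer 0 <= n < len(items)! to a distinct permutation.

    The mixed-radix digits of n choose the swap partner at each
    Fisher-Yates step, so different n give different orderings.
    """
    pool = list(items)
    for size in range(len(pool), 1, -1):
        n, j = divmod(n, size)  # j is always in range(size)
        pool[size - 1], pool[j] = pool[j], pool[size - 1]
    return pool

# Number of random bits needed to index every permutation of 10000 items.
bits_needed = math.ceil(math.log2(math.factorial(10000)))
print(bits_needed)

# Small demonstration: draw n from the OS entropy source, then decode it.
values = list(range(10))
n = secrets.randbelow(math.factorial(len(values)))
shuffled = permutation_from_index(values, n)
print(shuffled)
```

If decoding the permutation by hand is more than you need, random.SystemRandom().shuffle(x) also takes its randomness from the operating system (via os.urandom) instead of the Mersenne Twister.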
Answer 2:
You can use NumPy's shuffle function to shuffle the list elements in place:
import numpy as np
# In Python 3, range() is immutable, so build a list before shuffling in place
L = list(range(10000))
np.random.shuffle(L)
Timing the shuffle call (in Jupyter):
%timeit np.random.shuffle(L)
gives:
10000 loops, best of 3: 182 µs per loop
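Outside Jupyter, a similar measurement could be sketched with the standard library's timeit module. To stay dependency-free, this sketch swaps in random.shuffle for np.random.shuffle, so its timings are not comparable with the figure above. The repeat count of 100 is an arbitrary choice:

```python
import random
import timeit

L = list(range(10000))

# Time 100 in-place shuffles with the standard library's random.shuffle.
runs = 100
elapsed = timeit.timeit(lambda: random.shuffle(L), number=runs)
print(f"{elapsed / runs * 1e6:.0f} µs per shuffle")
```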
Source: https://stackoverflow.com/questions/46859257/shuffle-a-long-list-an-even-longer-number-of-times-in-python