Say I have a list of valid X = [1, 2, 3, 4, 5] and a list of valid Y = [1, 2, 3, 4, 5].
I need to generate all combinations of every element in
Distribute the x values (5 times each value) evenly across your output:
import random

def random_combo_without_x_repeats(xvals, yvals):
    # produce all valid combinations, but group by `x` and shuffle the `y`s
    grouped = [[x, random.sample(yvals, len(yvals))] for x in xvals]
    last_x = object()  # sentinel not equal to anything
    while grouped[0][1]:  # still `y`s left
        for _ in range(len(xvals)):
            # shuffle the `x`s, but skip any ordering that would
            # produce consecutive `x`s.
            random.shuffle(grouped)
            if grouped[0][0] != last_x:
                break
        else:
            # we tried to reshuffle N times, but ended up with the same `x` value
            # in the first position each time. This is pretty unlikely, but
            # if this happens we bail out and just reverse the order. That is
            # more than good enough.
            grouped = grouped[::-1]
        # yield one (x, y) pair for each unique x:
        # pick one y (from the pre-shuffled group) per x
        for x, ys in grouped:
            yield x, ys.pop()
            last_x = x
This shuffles the y values per x first, then gives you an (x, y) combination for each x. The order in which the x values are yielded is reshuffled on each iteration, and that is where you test for the restriction.
This is random, but you'll get all numbers between 1 and 5 in the x position before you'll see the same number again:
>>> list(random_combo_without_x_repeats(range(1, 6), range(1, 6)))
[(2, 1), (3, 2), (1, 5), (5, 1), (4, 1),
(2, 4), (3, 1), (4, 3), (5, 5), (1, 4),
(5, 2), (1, 1), (3, 3), (4, 4), (2, 5),
(3, 5), (2, 3), (4, 2), (1, 2), (5, 4),
(2, 2), (3, 4), (1, 3), (4, 5), (5, 3)]
(I manually grouped that into sets of 5). Overall, this makes for a pretty good random shuffling of a fixed input set with your restriction.
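If you want to convince yourself that the output really has these properties, a small check along the following lines should do. This is only a sketch: the generator is repeated so the snippet runs on its own, and `check_output` is a hypothetical helper name, not something from the code above.

```python
import random
from itertools import product


def random_combo_without_x_repeats(xvals, yvals):
    # Same generator as in the answer, repeated so this snippet is standalone.
    grouped = [[x, random.sample(yvals, len(yvals))] for x in xvals]
    last_x = object()
    while grouped[0][1]:
        for _ in range(len(xvals)):
            random.shuffle(grouped)
            if grouped[0][0] != last_x:
                break
        else:
            grouped = grouped[::-1]
        for x, ys in grouped:
            yield x, ys.pop()
            last_x = x


def check_output(pairs, xvals, yvals):
    """Hypothetical helper: True if `pairs` has all three properties."""
    xvals, yvals = list(xvals), list(yvals)
    n = len(xvals)
    # 1. Every (x, y) combination appears exactly once.
    complete = sorted(pairs) == sorted(product(xvals, yvals))
    # 2. No two neighbouring pairs share the same x.
    no_repeats = all(a[0] != b[0] for a, b in zip(pairs, pairs[1:]))
    # 3. Each block of n pairs contains every x exactly once.
    blocks_ok = all(
        sorted(x for x, _ in pairs[i:i + n]) == sorted(xvals)
        for i in range(0, len(pairs), n)
    )
    return complete and no_repeats and blocks_ok


pairs = list(random_combo_without_x_repeats(range(1, 6), range(1, 6)))
print(check_output(pairs, range(1, 6), range(1, 6)))
```

Running the check in a loop over many generated lists is a cheap way to catch a regression if you adapt the generator to your own data.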
It is efficient too. Each round has only a 1-in-N chance of needing to reshuffle the x order, so when N and M are about equal you should see roughly one reshuffle on average during a full run of the algorithm. The whole algorithm therefore stays within O(N*M), which is pretty much ideal for something that produces N times M elements of output. Because we limit the reshuffling to N attempts before falling back to a simple reverse, we avoid the (extremely unlikely) possibility of endlessly reshuffling.
The only drawback then is that it has to create N copies of the M y values up front.
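The reshuffle estimate is easy to test empirically. The sketch below replays only the x-ordering part of the algorithm and counts how often a shuffle is rejected; `count_collisions` is a hypothetical helper written for this experiment, and it assumes integer x values so that `None` can stand in for the sentinel.

```python
import random


def count_collisions(n, m, rng):
    """Hypothetical helper: replay the x-ordering logic for n x values and
    m rounds, counting shuffles rejected for repeating the previous x."""
    order = list(range(n))
    last_x = None  # x values are ints here, so None never matches
    collisions = 0
    for _ in range(m):
        for _ in range(n):
            rng.shuffle(order)
            if order[0] != last_x:
                break
            collisions += 1
        else:
            order.reverse()  # same fallback as in the generator
        last_x = order[-1]  # the last x yielded in this round
    return collisions


rng = random.Random(0)  # fixed seed so the experiment is repeatable
runs = 2000
average = sum(count_collisions(5, 5, rng) for _ in range(runs)) / runs
print(f"average rejected shuffles per run: {average:.2f}")
```

For N = M = 5 the average should come out close to 1, in line with the estimate above; the exact figure depends on the seed and the number of runs.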