For example, given the 5 numbers 1, 2, 3, 4, 5, I want a random result in which no number repeats within a row or a column, like:
[[ 2, 3, 1, 4, 5]
[ 5, 1, 2, 3, 4]
[ 3, 2, 4, 5, 1]
[ 1, 4, 5, 2, 3]]
I experimented with a brute-force random choice. Generate a row, and if valid, add to the accumulated lines:
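import numpy as np

def foo(n=5, maxi=200):
    numbers = np.arange(1, n + 1)   # was undefined in the snippet as posted
    arr = np.random.choice(numbers, n, replace=False)[None, :]
    for i in range(maxi):
        row = np.random.choice(numbers, n, replace=False)[None, :]
        # reject the row if it matches any accepted row in any column
        if (arr == row).any(): continue
        arr = np.concatenate((arr, row), axis=0)
        if arr.shape[0] == n: break
    print(i)   # index of the last try (maxi - 1 if the limit was hit)
    return arr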
Some sample runs:
In [66]: print(foo())
199
[[1 5 4 2 3]
[4 1 5 3 2]
[5 3 2 1 4]
[2 4 3 5 1]]
In [67]: print(foo())
100
[[4 2 3 1 5]
[1 4 5 3 2]
[5 1 2 4 3]
[3 5 1 2 4]
[2 3 4 5 1]]
In [68]: print(foo())
57
[[1 4 5 3 2]
[2 1 3 4 5]
[3 5 4 2 1]
[5 3 2 1 4]
[4 2 1 5 3]]
In [69]: print(foo())
174
[[2 1 5 4 3]
[3 4 1 2 5]
[1 3 2 5 4]
[4 5 3 1 2]
[5 2 4 3 1]]
In [76]: print(foo())
41
[[3 4 5 1 2]
[1 5 2 3 4]
[5 2 3 4 1]
[2 1 4 5 3]
[4 3 1 2 5]]
The required number of tries varies all over the place, with some exceeding my iteration limit.
Without getting into any theory, there is going to be a difference between quickly generating a 2d permutation and generating one that is, in some sense, maximally random. I suspect my approach is closer to this random goal than a more systematic and efficient approach (but I can't prove it).
def opFoo():
    numbers = list(range(1,6))
    result = np.zeros((5,5), dtype='int32')
    row_index = 0; i = 0
    while row_index < 5:
        np.random.shuffle(numbers)
        for column_index, number in enumerate(numbers):
            if number in result[:, column_index]:
                break
            else:
                # this else pairs with the if, so the row is written as soon
                # as the first column passes; later columns are never checked
                result[row_index, :] = numbers
                row_index += 1
        i += 1
    return i, result
In [125]: opFoo()
Out[125]:
(11, array([[2, 3, 1, 5, 4],
[4, 5, 1, 2, 3],
[3, 1, 2, 4, 5],
[1, 3, 5, 4, 2],
[5, 3, 4, 2, 1]]))
Mine is quite a bit slower than the OP's, but mine is correct.
This is an improvement on mine (2x faster):
def foo1(n=5,maxi=300):
    numbers = np.arange(1,n+1)
    np.random.shuffle(numbers)
    arr = numbers.copy()[None,:]
    for i in range(maxi):
        np.random.shuffle(numbers)
        if (arr==numbers).any(): continue
        arr = np.concatenate((arr, numbers[None,:]),axis=0)
        if arr.shape[0]==n: break
    return arr, i
In "Why is translated Sudoku solver slower than original?" I found that, for a Python translation of a Java Sudoku solver, using Python lists was faster than using numpy arrays.
I may try to adapt that script to this problem - tomorrow.
Just for your information, what you are looking for is a way of generating Latin squares. As for the solution, it depends on how random "random" needs to be for you.
I can think of at least four main techniques, two of which have already been proposed. Hence, I will briefly describe the other two: (1) loop through all permutations of the items and accept each one that keeps every column free of duplicates; (2) build every row as a cyclic shift of the first one, which satisfies the constraint by construction, and optionally shuffle the rows afterwards.
Assuming we work with standard Python data types, since I do not see a real merit in using NumPy (but results can be easily converted to np.ndarray if necessary), this would be in code (the first function is just to check that the solution is actually correct):
import random
import math
import itertools

# this only works for Iterable[Iterable]
def is_latin_rectangle(rows):
    valid = True
    for row in rows:
        if len(set(row)) < len(row):
            valid = False
    if valid and rows:
        for i, val in enumerate(rows[0]):
            col = [row[i] for row in rows]
            if len(set(col)) < len(col):
                valid = False
                break
    return valid

def is_latin_square(rows):
    return is_latin_rectangle(rows) and len(rows) == len(rows[0])

# prepare the input
n = 9
items = list(range(1, n + 1))
# shuffle items
random.shuffle(items)
# number of permutations
print(math.factorial(n))
def latin_square1(items, shuffle=True):
    result = []
    for elems in itertools.permutations(items):
        valid = True
        for i, elem in enumerate(elems):
            orthogonals = [x[i] for x in result] + [elem]
            if len(set(orthogonals)) < len(orthogonals):
                valid = False
                break
        if valid:
            result.append(elems)
    if shuffle:
        random.shuffle(result)
    return result

rows1 = latin_square1(items)
for row in rows1:
    print(row)
print(is_latin_square(rows1))
def latin_square2(items, shuffle=True, forward=False):
    sign = -1 if forward else 1
    result = [items[sign * i:] + items[:sign * i] for i in range(len(items))]
    if shuffle:
        random.shuffle(result)
    return result

rows2 = latin_square2(items)
for row in rows2:
    print(row)
print(is_latin_square(rows2))

rows2b = latin_square2(items, False)
for row in rows2b:
    print(row)
print(is_latin_square(rows2b))
For comparison, an implementation by trying random permutations and accepting valid ones (fundamentally what @hpaulj proposed) is also presented.
def latin_square3(items):
    result = [list(items)]
    while len(result) < len(items):
        new_row = list(items)
        random.shuffle(new_row)
        result.append(new_row)
        if not is_latin_rectangle(result):
            result = result[:-1]
    return result

rows3 = latin_square3(items)
for row in rows3:
    print(row)
print(is_latin_square(rows3))
I did not have time (yet) to implement the other method (with backtrack Sudoku-like solutions from @ConfusedByCode).
With timings for n = 5:
%timeit latin_square1(items)
321 µs ± 24.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit latin_square2(items)
7.5 µs ± 222 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit latin_square2(items, False)
2.21 µs ± 69.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit latin_square3(items)
2.15 ms ± 102 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
... and for n = 9:
%timeit latin_square1(items)
895 ms ± 18.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit latin_square2(items)
12.5 µs ± 200 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit latin_square2(items, False)
3.55 µs ± 55.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit latin_square3(items)
The slowest run took 36.54 times longer than the fastest. This could mean that an intermediate result is being cached.
9.76 s ± 9.23 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
So, solution 1 gives a fair deal of randomness but is not terribly fast (it scales with O(n!)), while solution 2 (and 2b) is much faster (scaling with O(n)) but not as random as solution 1. Solution 3 is very slow and its performance can vary significantly (it can probably be sped up by computing the last row instead of guessing it).
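The speed-up hinted at for solution 3 could look roughly like the sketch below. Once n - 1 valid rows exist, every column is missing exactly one item, so the final row is fully determined and need not be guessed. The names `complete_last_row` and `latin_square3b` are made up for this illustration and are not part of the answer above; no timings are claimed for it.

```python
import random


def complete_last_row(rows, items):
    """Return the only row that extends an (n-1) x n Latin rectangle."""
    all_items = set(items)
    last = []
    for i in range(len(items)):
        # each column already holds n - 1 distinct items, so one is missing
        (missing,) = all_items - {row[i] for row in rows}
        last.append(missing)
    return last


def latin_square3b(items):
    """Rejection sampling as in latin_square3, but the last row is computed."""
    n = len(items)
    result = [list(items)]
    while len(result) < n - 1:
        new_row = list(items)
        random.shuffle(new_row)
        # accept the row only if it clashes with no existing row in any column
        if all(new_row[i] != row[i] for row in result for i in range(n)):
            result.append(new_row)
    if n > 1:
        result.append(complete_last_row(result, items))
    return result
```

For instance, `complete_last_row([[1, 2, 3], [2, 3, 1]], [1, 2, 3])` gives `[3, 1, 2]`. The guessing loop still dominates the running time, so this only removes the last (and typically most expensive) round of guessing.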
Getting more technical, other efficient algorithms are discussed in:
This may seem odd, but you have basically described generating a random n-dimension Sudoku puzzle. From a blog post by Daniel Beer:
The basic approach to solving a Sudoku puzzle is by a backtracking search of candidate values for each cell. The general procedure is as follows:
Generate, for each cell, a list of candidate values by starting with the set of all possible values and eliminating those which appear in the same row, column and box as the cell being examined.
Choose one empty cell. If none are available, the puzzle is solved.
If the cell has no candidate values, the puzzle is unsolvable.
For each candidate value in that cell, place the value in the cell and try to recursively solve the puzzle.
There are two optimizations which greatly improve the performance of this algorithm:
When choosing a cell, always pick the one with the fewest candidate values. This reduces the branching factor. As values are added to the grid, the number of candidates for other cells reduces too.
When analysing the candidate values for empty cells, it's much quicker to start with the analysis of the previous step and modify it by removing values along the row, column and box of the last-modified cell. This is O(N) in the size of the puzzle, whereas analysing from scratch is O(N^3).
In your case an "unsolvable puzzle" is an invalid matrix. In a solvable puzzle, every element in the matrix is unique along both axes (there are no boxes to check).
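As a rough sketch of how the quoted procedure maps onto this problem (rows and columns only, no boxes), the function below backtracks over per-cell candidate sets and always expands the empty cell with the fewest candidates. `random_latin_square` is a hypothetical name, and the incremental-update optimization from the quote is only partly mirrored here through the per-row and per-column "used" sets. The result is random-looking, but nothing here shows that all squares are equally likely.

```python
import random


def random_latin_square(n):
    """Fill an n x n grid by backtracking over per-cell candidate sets."""
    grid = [[None] * n for _ in range(n)]
    row_used = [set() for _ in range(n)]
    col_used = [set() for _ in range(n)]
    values = set(range(1, n + 1))

    def candidates(r, c):
        # values not yet present in row r or column c
        return values - row_used[r] - col_used[c]

    def solve():
        empty = [(r, c) for r in range(n) for c in range(n)
                 if grid[r][c] is None]
        if not empty:
            return True  # no empty cell left: the square is complete
        # pick the empty cell with the fewest candidates (smaller branching)
        r, c = min(empty, key=lambda rc: len(candidates(*rc)))
        options = list(candidates(r, c))
        random.shuffle(options)  # try the candidates in random order
        for v in options:
            grid[r][c] = v
            row_used[r].add(v)
            col_used[c].add(v)
            if solve():
                return True
            # undo the placement and try the next candidate
            grid[r][c] = None
            row_used[r].remove(v)
            col_used[c].remove(v)
        return False  # dead end: the caller must backtrack

    solve()
    return grid
```

Calling `random_latin_square(5)` returns a list of five rows. The recursion depth equals the number of cells, so this sketch is meant for small n.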
EDIT: Below is an implementation of the second solution in norok2's answer.
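EDIT: we can shuffle the rows of the generated square again to make it more random. So the solve function can be modified to:
from math import gcd
from random import choice, shuffle

def solve(numbers):
    numbers = list(numbers)
    shuffle(numbers)
    n = len(numbers)
    # the shift must be coprime with n, otherwise rows repeat and columns clash
    shift = choice([s for s in range(1, n) if gcd(s, n) == 1])
    res = []
    for r in range(n):
        res.append(list(numbers))
        numbers = numbers[shift:] + numbers[:shift]
    # shuffle the row order of the finished square
    rows = list(range(n))
    shuffle(rows)
    shuffled_res = []
    for i in range(len(rows)):
        shuffled_res.append(res[rows[i]])
    return shuffled_res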
EDIT: I previously misunderstood the question. So, here's a 'quick' method which generates a 'to-some-extent' random solution. The basic idea is,
a, b, c
b, c, a
c, a, b
We can just rotate a row of data by a fixed step to form the next row, which satisfies our restriction.
So, here's the code:
from math import gcd
from random import choice, shuffle

def solve(numbers):
    numbers = list(numbers)
    shuffle(numbers)
    n = len(numbers)
    # the shift must be coprime with n, otherwise rows repeat and columns clash
    shift = choice([s for s in range(1, n) if gcd(s, n) == 1])
    res = []
    for r in range(n):
        res.append(list(numbers))
        numbers = numbers[shift:] + numbers[:shift]
    return res

def check(arr):
    # every column must contain distinct values
    for c in range(len(arr)):
        col = [arr[r][c] for r in range(len(arr))]
        if len(set(col)) != len(col):
            return False
    return True

if __name__ == '__main__':
    from pprint import pprint
    res = solve(range(5))
    pprint(res)
    print(check(res))
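One caveat about the fixed-step idea is worth spelling out: row k is the first row rotated by k times the step, so the columns stay free of repeats only when the step and the number of items share no common factor. With 5 items every step from 1 to 4 works, but with 4 items a step of 2 repeats rows. A small sketch that checks this, using the hypothetical helpers `cyclic_square` and `columns_unique`:

```python
from math import gcd


def cyclic_square(items, shift):
    """Build rows by rotating the previous row left by `shift` positions."""
    rows, row = [], list(items)
    for _ in range(len(row)):
        rows.append(row)
        row = row[shift:] + row[:shift]
    return rows


def columns_unique(rows):
    """True if no column contains a repeated value."""
    return all(len(set(col)) == len(col) for col in zip(*rows))


n = 4
for shift in range(1, n):
    ok = columns_unique(cyclic_square(range(n), shift))
    # the square is valid exactly when gcd(shift, n) == 1
    print(shift, gcd(shift, n), ok)
```

For n = 4 this reports the steps 1 and 3 as valid and the step 2 as invalid.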
Here is a possible solution using itertools, if you don't insist on numpy, which I'm not familiar with:
import itertools
from random import randint

# itertools.permutations returns an iterator over all permutations of the given sequence
perms = list(itertools.permutations(range(1, 6)))
# pick one permutation (a single row) at random; randint includes both endpoints
perms[randint(0, len(perms) - 1)]
Can't type code from the phone, so here's the pseudocode:
1. Create a matrix with one dimension more than the target matrix (3D).
2. Initialize each of the 25 elements with the numbers from 1 to 5.
3. Iterate over the 25 elements.
4. Choose a random value for the first element from its element list (which contains the numbers 1 through 5).
5. Remove the randomly chosen value from all the elements in its row and column.
6. Repeat steps 4 and 5 for all the elements.
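The pseudocode above might translate into something like the following sketch. One thing the pseudocode does not cover is what happens when a cell runs out of candidates before it is reached. Restarting from scratch, as done here, is an assumption of this sketch, and so are the name `fill_by_elimination` and the `max_restarts` parameter.

```python
import random


def fill_by_elimination(n=5, max_restarts=1000):
    """Pick each cell from its candidate set, pruning its row and column."""
    for _ in range(max_restarts):
        # one candidate set per cell: the extra "dimension" in the pseudocode
        cand = [[set(range(1, n + 1)) for _ in range(n)] for _ in range(n)]
        grid = [[0] * n for _ in range(n)]
        dead_end = False
        for r in range(n):
            for c in range(n):
                if not cand[r][c]:
                    dead_end = True  # nothing left to choose: start over
                    break
                value = random.choice(sorted(cand[r][c]))
                grid[r][c] = value
                # remove the chosen value from the whole row and column
                for k in range(n):
                    cand[r][k].discard(value)
                    cand[k][c].discard(value)
            if dead_end:
                break
        if not dead_end:
            return grid
    raise RuntimeError("no square found within max_restarts")
```

Because there is no backtracking, the number of restarts varies from run to run, much like the brute-force row guessing in the other answers.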