Efficient way of generating latin squares (or randomly permute numbers in matrix uniquely on both axes - using NumPy)


For example, if there are 5 numbers 1, 2, 3, 4, 5

I want a random result like

[[ 2, 3, 1, 4, 5]
 [ 5, 1, 2, 3, 4]
 [ 3, 2, 4, 5, 1]
 [ 1, 4, 5, 2, 3]


                      
5 answers

        
                         				            
            
           
            
                              
                
              
              
                
                  独厮守ぢ        
                
              
                            
                2021-01-05 00:59
              
            
            
                                                                       
I experimented with a brute-force random choice.  Generate a row, and if valid, add to the accumulated lines:

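import numpy as np

def foo(n=5, maxi=200):
    numbers = np.arange(1, n + 1)  # the symbols 1..n
    # Start from one random permutation as the first row.
    arr = np.random.choice(numbers, n, replace=False)[None, :]
    for i in range(maxi):
        row = np.random.choice(numbers, n, replace=False)[None, :]
        # Reject the candidate if it matches any accepted row in any column.
        if (arr == row).any():
            continue
        arr = np.concatenate((arr, row), axis=0)
        if arr.shape[0] == n:
            break
    print(i)  # index of the last attempt (maxi - 1 if the limit was hit)
    return arr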


Some sample runs:

In [66]: print(foo())
199
[[1 5 4 2 3]
 [4 1 5 3 2]
 [5 3 2 1 4]
 [2 4 3 5 1]]
In [67]: print(foo())
100
[[4 2 3 1 5]
 [1 4 5 3 2]
 [5 1 2 4 3]
 [3 5 1 2 4]
 [2 3 4 5 1]]
In [68]: print(foo())
57
[[1 4 5 3 2]
 [2 1 3 4 5]
 [3 5 4 2 1]
 [5 3 2 1 4]
 [4 2 1 5 3]]
In [69]: print(foo())
174
[[2 1 5 4 3]
 [3 4 1 2 5]
 [1 3 2 5 4]
 [4 5 3 1 2]
 [5 2 4 3 1]]
In [76]: print(foo())
41
[[3 4 5 1 2]
 [1 5 2 3 4]
 [5 2 3 4 1]
 [2 1 4 5 3]
 [4 3 1 2 5]]


The required number of tries varies all over the place, with some exceeding my iteration limit.

Without getting into any theory, there is going to be a difference between quickly generating a 2d permutation and generating one that is, in some sense, maximally random. I suspect my approach is closer to that random goal than a more systematic and efficient approach would be (but I can't prove it).



# The OP's function, reproduced here for comparison.
def opFoo():
    numbers = list(range(1,6))
    result = np.zeros((5,5), dtype='int32')
    row_index = 0; i = 0
    while row_index < 5:
        np.random.shuffle(numbers)
        for column_index, number in enumerate(numbers):
            if number in result[:, column_index]:
                break
            else:
                result[row_index, :] = numbers
                row_index += 1
        i += 1
    return i, result

In [125]: opFoo()
Out[125]: 
(11, array([[2, 3, 1, 5, 4],
        [4, 5, 1, 2, 3],
        [3, 1, 2, 4, 5],
        [1, 3, 5, 4, 2],
        [5, 3, 4, 2, 1]]))


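Mine is quite a bit slower than the OP's, but mine is correct. The opFoo output above repeats values within columns (3 appears three times in the second column): as written, its else is paired with the if rather than with the for loop, so a row is accepted as soon as its first column passes the check.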
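A correctness claim like this is easy to check mechanically. Below is a minimal sketch of a NumPy validator; the name is_latin_square_np is made up for illustration and is not part of any answer here. It assumes a square 2-D array and compares every sorted row and column against the sorted first row.

```python
import numpy as np

def is_latin_square_np(arr):
    """Return True if arr is n x n and every row and column holds the same n distinct values."""
    arr = np.asarray(arr)
    if arr.ndim != 2 or arr.shape[0] != arr.shape[1] or arr.shape[0] == 0:
        return False
    n = arr.shape[0]
    symbols = np.sort(arr[0])          # reference symbol set, sorted
    if np.unique(symbols).size != n:   # the first row itself must not repeat a value
        return False
    rows_ok = (np.sort(arr, axis=1) == symbols).all()
    cols_ok = (np.sort(arr, axis=0) == symbols[:, None]).all()
    return bool(rows_ok and cols_ok)

# One of the foo() sample outputs and the opFoo() output shown above.
good = [[4, 2, 3, 1, 5], [1, 4, 5, 3, 2], [5, 1, 2, 4, 3], [3, 5, 1, 2, 4], [2, 3, 4, 5, 1]]
bad = [[2, 3, 1, 5, 4], [4, 5, 1, 2, 3], [3, 1, 2, 4, 5], [1, 3, 5, 4, 2], [5, 3, 4, 2, 1]]
print(is_latin_square_np(good))  # True
print(is_latin_square_np(bad))   # False
```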



This is an improvement on mine (2x faster):

def foo1(n=5,maxi=300):
    numbers = np.arange(1,n+1)
    np.random.shuffle(numbers)
    arr = numbers.copy()[None,:]
    for i in range(maxi):
        np.random.shuffle(numbers)
        if (arr==numbers).any(): continue
        arr = np.concatenate((arr, numbers[None,:]),axis=0)
        if arr.shape[0]==n: break
    return arr, i




Related question: "Why is translated Sudoku solver slower than original?"

There I found that, for a Python translation of a Java Sudoku solver, using Python lists was faster than using numpy arrays.

I may try to adapt that script to this problem - tomorrow.
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  佛祖请我去吃肉        
                
              
                            
                2021-01-05 01:08
              
            
            
                                                                       
Just for your information, what you are looking for is a way of generating latin squares.
As for the solution, it depends on how much random "random" is for you.

I can think of at least four main techniques, two of which have already been proposed.
Hence, I will briefly describe the other two:


- Loop through all possible permutations of the items and accept each one that satisfies the uniqueness constraint against the rows accepted so far (no value repeated within a column).
- Use only cyclic permutations to build subsequent rows: by construction these satisfy the uniqueness constraint (the cyclic transformation can be done forward or backward); for improved "randomness" the rows can be shuffled.


Assuming we work with standard Python data types since I do not see a real merit in using NumPy (but results can be easily converted to np.ndarray if necessary), this would be in code (the first function is just to check that the solution is actually correct):

import random
import math
import itertools

# this only works for a sequence of sequences (e.g. a list of lists or tuples)
def is_latin_rectangle(rows):
    valid = True
    for row in rows:
        if len(set(row)) < len(row):
            valid = False
    if valid and rows:
        for i, val in enumerate(rows[0]):
            col = [row[i] for row in rows]
            if len(set(col)) < len(col):
                valid = False
                break
    return valid

def is_latin_square(rows):
    return is_latin_rectangle(rows) and len(rows) == len(rows[0])

# prepare the input
n = 9
items = list(range(1, n + 1))
# shuffle items
random.shuffle(items)
# number of permutations
print(math.factorial(n))


def latin_square1(items, shuffle=True):
    result = []
    for elems in itertools.permutations(items):
        valid = True
        for i, elem in enumerate(elems):
            orthogonals = [x[i] for x in result] + [elem]
            if len(set(orthogonals)) < len(orthogonals):
                valid = False
                break
        if valid:
            result.append(elems)
    if shuffle:
        random.shuffle(result)
    return result

rows1 = latin_square1(items)
for row in rows1:
    print(row)
print(is_latin_square(rows1))


def latin_square2(items, shuffle=True, forward=False):
    sign = -1 if forward else 1
    result = [items[sign * i:] + items[:sign * i] for i in range(len(items))]
    if shuffle:
        random.shuffle(result)
    return result

rows2 = latin_square2(items)
for row in rows2:
    print(row)
print(is_latin_square(rows2))

rows2b = latin_square2(items, False)
for row in rows2b:
    print(row)
print(is_latin_square(rows2b))
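Since the question asks about NumPy, the cyclic construction can also be sketched with arrays. The function name and the rng parameter below are assumptions made for illustration. Permuting rows, columns and symbols of a cyclic square keeps it a Latin square, but the result is still not uniformly distributed over all Latin squares.

```python
import numpy as np

def random_cyclic_latin_square(n, rng=None):
    """Shuffled cyclic Latin square of order n with symbols 1..n."""
    rng = np.random.default_rng() if rng is None else rng
    idx = np.arange(n)
    # Cyclic square: entry (i, j) is (i + j) mod n.
    square = (idx[:, None] + idx[None, :]) % n
    # Shuffling whole rows and whole columns preserves the Latin property.
    square = square[rng.permutation(n)][:, rng.permutation(n)]
    # Relabel the symbols 0..n-1 with a shuffled 1..n.
    symbols = rng.permutation(n) + 1
    return symbols[square]

print(random_cyclic_latin_square(5))
```

Everything here is vectorised, so the cost is dominated by building the n x n output.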


For comparison, an implementation by trying random permutations and accepting valid ones (fundamentally what @hpaulj proposed) is also presented.

def latin_square3(items):
    result = [list(items)]
    while len(result) < len(items):
        new_row = list(items)
        random.shuffle(new_row)
        result.append(new_row)
        if not is_latin_rectangle(result):
            result = result[:-1]
    return result

rows3 = latin_square3(items)
for row in rows3:
    print(row)
print(is_latin_square(rows3))


I have not (yet) had time to implement the other method (the Sudoku-like backtracking approach from @ConfusedByCode).

With timings for n = 5:

%timeit latin_square1(items)
321 µs ± 24.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit latin_square2(items)
7.5 µs ± 222 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit latin_square2(items, False)
2.21 µs ± 69.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit latin_square3(items)
2.15 ms ± 102 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


... and for n = 9:

%timeit latin_square1(items)
895 ms ± 18.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit latin_square2(items)
12.5 µs ± 200 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit latin_square2(items, False)
3.55 µs ± 55.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit latin_square3(items)
The slowest run took 36.54 times longer than the fastest. This could mean that an intermediate result is being cached.
9.76 s ± 9.23 s per loop (mean ± std. dev. of 7 runs, 1 loop each)


So, solution 1 gives a fair amount of randomness but is not terribly fast (it scales as O(n!)), while solution 2 (and 2b) is much faster (it only builds n rows of n items, so O(n²)) but not as random as solution 1. Solution 3 is very slow and its running time varies significantly (it can probably be sped up by computing the last row instead of guessing it).
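The idea of computing the last row instead of guessing it could look roughly like the sketch below. The names complete_last_row and latin_square3_fast are hypothetical, and the items are assumed to be distinct and hashable. Once n - 1 valid rows exist, each column is missing exactly one item, so the final row is fully determined.

```python
import random

def complete_last_row(rows, items):
    """Given n - 1 valid rows over n items, return the only possible final row."""
    all_items = set(items)
    last = []
    for col in zip(*rows):
        (missing,) = all_items - set(col)  # exactly one item is absent from each column
        last.append(missing)
    return last

def latin_square3_fast(items):
    n = len(items)
    result = [list(items)]
    # Guess rows at random until n - 1 of them have been accepted.
    while len(result) < n - 1:
        new_row = list(items)
        random.shuffle(new_row)
        # Accept the row only if it clashes with no earlier row in any column.
        if all(a != b for row in result for a, b in zip(row, new_row)):
            result.append(new_row)
    if n > 1:
        result.append(complete_last_row(result, items))
    return result

for row in latin_square3_fast(list(range(1, 6))):
    print(row)
```

This only removes the most expensive guess (the one with the fewest valid permutations); the earlier rows are still found by rejection sampling, so the running time remains variable.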

Getting more technical, other efficient algorithms are discussed in:


Jacobson, M. T. and Matthews, P. (1996), Generating uniformly distributed random latin squares. J. Combin. Designs, 4: 405-437. doi:10.1002/(SICI)1520-6610(1996)4:6<405::AID-JCD3>3.0.CO;2-J

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  北荒        
                
              
                            
                2021-01-05 01:17
              
            
            
                                                                       
This may seem odd, but you have basically described generating a random n-dimension Sudoku puzzle. From a blog post by Daniel Beer:

The basic approach to solving a Sudoku puzzle is by a backtracking search of candidate values for each cell. The general procedure is as follows:

Generate, for each cell, a list of candidate values by starting with the set of all possible values and eliminating those which appear in the same row, column and box as the cell being examined.

Choose one empty cell. If none are available, the puzzle is solved.

If the cell has no candidate values, the puzzle is unsolvable.

For each candidate value in that cell, place the value in the cell and try to recursively solve the puzzle.


There are two optimizations which greatly improve the performance of this algorithm:

When choosing a cell, always pick the one with the fewest candidate values. This reduces the branching factor. As values are added to the grid, the number of candidates for other cells reduces too.

When analysing the candidate values for empty cells, it's much quicker to start with the analysis of the previous step and modify it by removing values along the row, column and box of the last-modified cell. This is O(N) in the size of the puzzle, whereas analysing from scratch is O(N^3).



In your case, an "unsolvable puzzle" is an invalid matrix, and the box constraint of Sudoku simply does not apply. In a solvable puzzle, every element of the matrix is unique along both axes.
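Applied to this question, the backtracking search could be sketched as follows. This is a simplified illustration, not the blog post's solver: the function name is made up, cells are filled in row-major order, and neither of the two optimizations quoted above is implemented. Candidates are tried in random order so that different runs give different squares.

```python
import random

def random_latin_square_backtracking(n, rng=random):
    """Fill an n x n grid with 1..n by randomized backtracking."""
    grid = [[0] * n for _ in range(n)]
    row_used = [set() for _ in range(n)]  # values already placed in each row
    col_used = [set() for _ in range(n)]  # values already placed in each column

    def fill(pos):
        if pos == n * n:
            return True  # no empty cell left: solved
        r, c = divmod(pos, n)
        candidates = [v for v in range(1, n + 1)
                      if v not in row_used[r] and v not in col_used[c]]
        rng.shuffle(candidates)
        for v in candidates:
            grid[r][c] = v
            row_used[r].add(v)
            col_used[c].add(v)
            if fill(pos + 1):
                return True
            # Dead end further on: undo the placement and try the next candidate.
            row_used[r].discard(v)
            col_used[c].discard(v)
        grid[r][c] = 0
        return False

    fill(0)
    return grid

for row in random_latin_square_backtracking(5):
    print(row)
```

The recursion goes one level deep per cell, so for large n (roughly 30 and above with Python's default recursion limit) an iterative version would be needed.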
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  旧巷少年郎        
                
              
                            
                2021-01-05 01:18
              
            
            
                                                                       
EDIT:  Below is an implementation of the second solution in norok2's answer.

EDIT: We can also shuffle the rows of the generated square to make it more random.
So the solve function can be modified to:

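from math import gcd
from random import shuffle, choice


def solve(numbers):
    numbers = list(numbers)  # copy, so any iterable (e.g. range) is accepted
    n = len(numbers)
    shuffle(numbers)
    # Only shifts coprime with n give n distinct rows with no column repeats.
    shift = choice([s for s in range(1, n) if gcd(s, n) == 1] or [0])
    res = []

    for r in range(n):
        res.append(list(numbers))
        numbers = numbers[shift:] + numbers[:shift]

    # Shuffle the order of the rows for extra randomness.
    shuffle(res)
    return res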


EDIT: I previously misunderstood the question.
So, here's a 'quick' method which generates solutions that are random 'to some extent'.
The basic idea is:

    a, b, c
    b, c, a
    c, a, b


We can cyclically shift a row by a fixed step to form the next row. This satisfies the constraint as long as the step is coprime with the row length (otherwise some rows repeat).

So, here's the code:

from math import gcd
from random import shuffle, choice


def solve(numbers):
    numbers = list(numbers)  # copy, so any iterable (e.g. range) is accepted
    n = len(numbers)
    shuffle(numbers)
    # Only shifts coprime with n give n distinct rows with no column repeats.
    shift = choice([s for s in range(1, n) if gcd(s, n) == 1] or [0])
    res = []

    for r in range(n):
        res.append(list(numbers))
        numbers = numbers[shift:] + numbers[:shift]

    return res


def check(arr):
    # Every column must contain distinct values.
    for c in range(len(arr)):
        col = [arr[r][c] for r in range(len(arr))]
        if len(set(col)) != len(col):
            return False
    return True


if __name__ == '__main__':
    from pprint import pprint
    res = solve(range(5))
    pprint(res)
    print(check(res))




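This is a possible solution with itertools (it picks a single random permutation, i.e. one row), if you don't insist on using numpy, which I'm not familiar with:

import itertools
from random import randint

# itertools.permutations returns an iterator over all permutations of the given sequence.
perms = list(itertools.permutations(range(1, 6)))
# Pick one permutation uniformly at random (randint includes both endpoints).
perms[randint(0, len(perms) - 1)]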

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  刺人心        
                
              
                            
                2021-01-05 01:22
              
            
            
                                                                       
I can't type code on my phone, so here is the pseudocode:


1. Create a matrix with one dimension more than the target matrix (that is, 3-D).
2. Initialize each of the 25 elements with the numbers from 1 to 5.
3. Iterate over the 25 elements.
4. For the current element, choose a random value from its element list (which initially contains the numbers 1 through 5).
5. Remove the randomly chosen value from all the elements in its row and column.
6. Repeat steps 4 and 5 for all the elements.
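A rough Python sketch of these steps follows. The function names and the max_restarts parameter are assumptions, not part of the answer. One thing the pseudocode does not cover is that a cell can run out of candidates before it is filled; this sketch simply starts over when that happens.

```python
import random

def try_fill(n):
    """One pass over the grid; returns None if some cell has no candidate left."""
    # One candidate set per cell: the extra "third dimension" of the pseudocode.
    candidates = [[set(range(1, n + 1)) for _ in range(n)] for _ in range(n)]
    grid = [[0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            if not candidates[r][c]:
                return None  # dead end
            value = random.choice(sorted(candidates[r][c]))
            grid[r][c] = value
            # Remove the chosen value from every cell in the same row and column.
            for k in range(n):
                candidates[r][k].discard(value)
                candidates[k][c].discard(value)
    return grid

def latin_square_by_elimination(n, max_restarts=10_000):
    for _ in range(max_restarts):
        grid = try_fill(n)
        if grid is not None:
            return grid
    raise RuntimeError("no square found within max_restarts")

for row in latin_square_by_elimination(5):
    print(row)
```

Restarting is the simplest way to handle dead ends; replacing it with backtracking would avoid throwing away the work already done.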

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         