Efficient way of generating latin squares (or randomly permute numbers in matrix uniquely on both axes - using NumPy)

悲&欢浪女 2021-01-05 00:31

For example, if there are 5 numbers 1, 2, 3, 4, 5

I want a random result like

[[ 2, 3, 1, 4, 5]
 [ 5, 1, 2, 3, 4]
 [ 3, 2, 4, 5, 1]
 [ 1, 4, 5, 2, 3]
 [ 4, 5, 3, 1, 2]]

5 Answers
  • 2021-01-05 00:59

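    I experimented with a brute-force random choice: generate a row and, if it is valid, add it to the accumulated rows:

    import numpy as np

    def foo(n=5, maxi=200):
        # numbers was not defined in the original snippet; assumed to be 1..n
        numbers = np.arange(1, n + 1)
        arr = np.random.choice(numbers, n, replace=False)[None, :]
        for i in range(maxi):
            row = np.random.choice(numbers, n, replace=False)[None, :]
            # reject the row if it matches any accumulated row in any column
            if (arr == row).any(): continue
            arr = np.concatenate((arr, row), axis=0)
            if arr.shape[0] == n: break
        print(i)
        return arr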
    

    Some sample runs:

    In [66]: print(foo())
    199
    [[1 5 4 2 3]
     [4 1 5 3 2]
     [5 3 2 1 4]
     [2 4 3 5 1]]
    In [67]: print(foo())
    100
    [[4 2 3 1 5]
     [1 4 5 3 2]
     [5 1 2 4 3]
     [3 5 1 2 4]
     [2 3 4 5 1]]
    In [68]: print(foo())
    57
    [[1 4 5 3 2]
     [2 1 3 4 5]
     [3 5 4 2 1]
     [5 3 2 1 4]
     [4 2 1 5 3]]
    In [69]: print(foo())
    174
    [[2 1 5 4 3]
     [3 4 1 2 5]
     [1 3 2 5 4]
     [4 5 3 1 2]
     [5 2 4 3 1]]
    In [76]: print(foo())
    41
    [[3 4 5 1 2]
     [1 5 2 3 4]
     [5 2 3 4 1]
     [2 1 4 5 3]
     [4 3 1 2 5]]
    

    The required number of tries varies all over the place, with some exceeding my iteration limit.

    Without getting into any theory, there's going to be a difference between quickly generating a 2d permutation and generating one that is, in some sense or other, maximally random. I suspect my approach is closer to this random goal than a more systematic and efficient approach (but I can't prove it).


    def opFoo():
        numbers = list(range(1,6))
        result = np.zeros((5,5), dtype='int32')
        row_index = 0; i = 0
        while row_index < 5:
            np.random.shuffle(numbers)
            for column_index, number in enumerate(numbers):
                if number in result[:, column_index]:
                    break
                else:
                    result[row_index, :] = numbers
                    row_index += 1
            i += 1
        return i, result
    
    In [125]: opFoo()
    Out[125]: 
    (11, array([[2, 3, 1, 5, 4],
            [4, 5, 1, 2, 3],
            [3, 1, 2, 4, 5],
            [1, 3, 5, 4, 2],
            [5, 3, 4, 2, 1]]))
    

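    Mine is quite a bit slower than the OP's (opFoo above), but mine is correct: the opFoo output above repeats values within columns (3 appears three times in the second column).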
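
    In opFoo the else is paired with the if rather than with the for loop, so a row is written as soon as its first column passes the check. A minimal sketch of one possible repair, assuming the intent was Python's for/else (the name op_foo_fixed and the rng parameter are hypothetical, not from the original post):

```python
import numpy as np

def op_foo_fixed(n=5, rng=None):
    """Rejection-sample rows; keep a row only if no column repeats a value."""
    # rng is an assumed convenience parameter so runs can be seeded.
    rng = np.random.default_rng() if rng is None else rng
    numbers = np.arange(1, n + 1)
    result = np.zeros((n, n), dtype=int)
    row_index = 0
    tries = 0
    while row_index < n:
        rng.shuffle(numbers)
        tries += 1
        for column_index, number in enumerate(numbers):
            # Compare only against the rows that are already filled.
            if number in result[:row_index, column_index]:
                break
        else:
            # for/else: runs only when the loop finished without a break,
            # i.e. every column accepted its number.
            result[row_index, :] = numbers
            row_index += 1
    return tries, result
```

    Calling op_foo_fixed(5) returns the number of shuffles tried and the square. It is still rejection sampling, so the number of tries varies from run to run.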


    This is an improvement on mine (2x faster):

    def foo1(n=5,maxi=300):
        numbers = np.arange(1,n+1)
        np.random.shuffle(numbers)
        arr = numbers.copy()[None,:]
        for i in range(maxi):
            np.random.shuffle(numbers)
            if (arr==numbers).any(): continue
            arr = np.concatenate((arr, numbers[None,:]),axis=0)
            if arr.shape[0]==n: break
        return arr, i
    

    Related: Why is translated Sudoku solver slower than original?

    With that translation of a Java Sudoku solver, I found that using Python lists was faster than using NumPy arrays.

    I may try to adapt that script to this problem - tomorrow.

  • 2021-01-05 01:08

    Just for your information, what you are looking for is a way of generating Latin squares. As for the solution, it depends on how random "random" needs to be for you.

    I can think of at least four main techniques, two of which have already been proposed. Hence, I will briefly describe the other two:

    1. loop through all possible permutations of the items and accept, in order, each one that keeps the values unique along every column
    2. use only cyclic permutations to build subsequent rows: by construction these keep the values unique along every column (the cyclic shift can be done forward or backward); for improved "randomness" the rows can be shuffled

    In code, using standard Python data types (I do not see real merit in using NumPy here, but the results can easily be converted to np.ndarray if necessary), this looks as follows; the first function just checks that a solution is actually correct:

    import random
    import math
    import itertools
    
    # this only works for Sequence[Sequence], e.g. a list of lists or tuples
    def is_latin_rectangle(rows):
        valid = True
        for row in rows:
            if len(set(row)) < len(row):
                valid = False
        if valid and rows:
            for i, val in enumerate(rows[0]):
                col = [row[i] for row in rows]
                if len(set(col)) < len(col):
                    valid = False
                    break
        return valid
    
    def is_latin_square(rows):
        return is_latin_rectangle(rows) and len(rows) == len(rows[0])
    
    # : prepare the input
    n = 9
    items = list(range(1, n + 1))
    # shuffle items
    random.shuffle(items)
    # number of permutations
    print(math.factorial(n))
    
    
    def latin_square1(items, shuffle=True):
        result = []
        for elems in itertools.permutations(items):
            valid = True
            for i, elem in enumerate(elems):
                orthogonals = [x[i] for x in result] + [elem]
                if len(set(orthogonals)) < len(orthogonals):
                    valid = False
                    break
            if valid:
                result.append(elems)
        if shuffle:
            random.shuffle(result)
        return result
    
    rows1 = latin_square1(items)
    for row in rows1:
        print(row)
    print(is_latin_square(rows1))
    
    
    def latin_square2(items, shuffle=True, forward=False):
        sign = -1 if forward else 1
        result = [items[sign * i:] + items[:sign * i] for i in range(len(items))]
        if shuffle:
            random.shuffle(result)
        return result
    
    rows2 = latin_square2(items)
    for row in rows2:
        print(row)
    print(is_latin_square(rows2))
    
    rows2b = latin_square2(items, False)
    for row in rows2b:
        print(row)
    print(is_latin_square(rows2b))
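
    Since the question asks about NumPy, here is a hedged sketch of the same cyclic construction written with arrays. The function name latin_square_cyclic_np and the rng parameter are assumptions for illustration. Rows, columns and symbols are all permuted, but the result is still derived from a single cyclic square, so it should not be expected to be uniform over all Latin squares.

```python
import numpy as np

def latin_square_cyclic_np(items, rng=None):
    """Cyclic Latin square with rows, columns and symbols shuffled."""
    rng = np.random.default_rng() if rng is None else rng
    items = np.asarray(items)
    n = len(items)
    # Cyclic index table: entry (i, j) is (i + j) mod n.
    idx = (np.arange(n)[:, None] + np.arange(n)) % n
    # Permuting whole rows and whole columns keeps the Latin property.
    idx = idx[rng.permutation(n)][:, rng.permutation(n)]
    # Map the indices onto a random relabelling of the items.
    return rng.permutation(items)[idx]
```

    For example, latin_square_cyclic_np(np.arange(1, 6)) returns a 5x5 integer array.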
    

    For comparison, an implementation by trying random permutations and accepting valid ones (fundamentally what @hpaulj proposed) is also presented.

    def latin_square3(items):
        result = [list(items)]
        while len(result) < len(items):
            new_row = list(items)
            random.shuffle(new_row)
            result.append(new_row)
            if not is_latin_rectangle(result):
                result = result[:-1]
        return result
    
    rows3 = latin_square3(items)
    for row in rows3:
        print(row)
    print(is_latin_square(rows3))
    

    I did not have time (yet) to implement the other method (the Sudoku-like backtracking solution from @ConfusedByCode).

    With timings for n = 5:

    %timeit latin_square1(items)
    321 µs ± 24.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    %timeit latin_square2(items)
    7.5 µs ± 222 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    %timeit latin_square2(items, False)
    2.21 µs ± 69.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    %timeit latin_square3(items)
    2.15 ms ± 102 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    ... and for n = 9:

    %timeit latin_square1(items)
    895 ms ± 18.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    %timeit latin_square2(items)
    12.5 µs ± 200 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    %timeit latin_square2(items, False)
    3.55 µs ± 55.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
    
    %timeit latin_square3(items)
    The slowest run took 36.54 times longer than the fastest. This could mean that an intermediate result is being cached.
    9.76 s ± 9.23 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
    

    So, solution 1 gives a fair deal of randomness but is not terribly fast (it scales with O(n!)); solution 2 (and 2b) is much faster (it only builds n rows of n items, so O(n^2)) but not as random as solution 1. Solution 3 is very slow and its performance can vary significantly (it can probably be sped up by computing the last row instead of guessing it).
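
    The suggested speed-up for solution 3 could look like the sketch below: once n - 1 rows are in place, every column is missing exactly one item, so the last row can be computed directly. The function name latin_square3_last_computed is hypothetical, and the items are assumed to be distinct and hashable.

```python
import random

def latin_square3_last_computed(items):
    """Random-row rejection sampling, with the final row computed."""
    items = list(items)
    n = len(items)
    result = [items[:]]
    while len(result) < n - 1:
        new_row = items[:]
        random.shuffle(new_row)
        # Accept only if no column already holds the candidate value.
        if all(new_row[c] != row[c] for row in result for c in range(n)):
            result.append(new_row)
    if n > 1:
        # Each column lacks exactly one item; together they form the last row.
        all_items = set(items)
        last = [(all_items - {row[c] for row in result}).pop() for c in range(n)]
        result.append(last)
    return result
```

    This removes the most expensive guess, because the last row has only one valid permutation out of n!. I have not timed it against the versions above.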

    Getting more technical, other efficient algorithms are discussed in:

    • Jacobson, M. T. and Matthews, P. (1996), Generating uniformly distributed random latin squares. J. Combin. Designs, 4: 405-437. doi:10.1002/(SICI)1520-6610(1996)4:6<405::AID-JCD3>3.0.CO;2-J
  • 2021-01-05 01:17

    This may seem odd, but you have basically described generating a random n-dimension Sudoku puzzle. From a blog post by Daniel Beer:

    The basic approach to solving a Sudoku puzzle is by a backtracking search of candidate values for each cell. The general procedure is as follows:

    1. Generate, for each cell, a list of candidate values by starting with the set of all possible values and eliminating those which appear in the same row, column and box as the cell being examined.

    2. Choose one empty cell. If none are available, the puzzle is solved.

    3. If the cell has no candidate values, the puzzle is unsolvable.

    4. For each candidate value in that cell, place the value in the cell and try to recursively solve the puzzle.

    There are two optimizations which greatly improve the performance of this algorithm:

    1. When choosing a cell, always pick the one with the fewest candidate values. This reduces the branching factor. As values are added to the grid, the number of candidates for other cells reduces too.

    2. When analysing the candidate values for empty cells, it's much quicker to start with the analysis of the previous step and modify it by removing values along the row, column and box of the last-modified cell. This is O(N) in the size of the puzzle, whereas analysing from scratch is O(N^3).

    In your case an "unsolvable puzzle" is an invalid matrix. In a solvable puzzle, every element in the matrix is unique along both axes.
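
    The procedure quoted above can be sketched for this problem roughly as follows. This is only an illustration under assumptions: the function name latin_square_backtrack is hypothetical, the box constraint is dropped because a Latin square only has rows and columns, and candidates are tried in random order to get a random (not necessarily uniformly distributed) square. It is meant for small n, since it recurses once per cell.

```python
import random

def latin_square_backtrack(n, rng=random):
    """Fill an n x n grid by backtracking search over candidate values."""
    values = list(range(1, n + 1))
    grid = [[0] * n for _ in range(n)]        # 0 marks an empty cell
    row_used = [set() for _ in range(n)]      # values already in each row
    col_used = [set() for _ in range(n)]      # values already in each column

    def candidates(r, c):
        return [v for v in values
                if v not in row_used[r] and v not in col_used[c]]

    def solve():
        empty = [(r, c) for r in range(n) for c in range(n)
                 if grid[r][c] == 0]
        if not empty:
            return True                       # no empty cell left: solved
        # Pick the empty cell with the fewest candidates.
        r, c = min(empty, key=lambda rc: len(candidates(*rc)))
        cands = candidates(r, c)
        rng.shuffle(cands)                    # random order of attempts
        for v in cands:
            grid[r][c] = v
            row_used[r].add(v)
            col_used[c].add(v)
            if solve():
                return True
            grid[r][c] = 0                    # undo and try the next value
            row_used[r].discard(v)
            col_used[c].discard(v)
        return False                          # no candidate worked: dead end

    solve()
    return grid
```

    Calling latin_square_backtrack(5) returns a list of five rows. The row and column sets are updated incrementally, in the spirit of the second optimization, although the candidate lists themselves are rebuilt on every call.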

  • 2021-01-05 01:18

    EDIT: Below is an implementation of the second solution in norok2's answer.

    EDIT: We can also shuffle the rows of the generated square to make it more random. The solve function can then be modified to:

    from math import gcd
    from random import choice, shuffle

    def solve(numbers):
        numbers = list(numbers)   # work on a copy; also accepts range(...)
        n = len(numbers)
        shuffle(numbers)
        # the shift must be coprime with n, otherwise rows repeat (e.g. n=4, shift=2)
        shift = choice([s for s in range(1, n) if gcd(s, n) == 1])
        res = []

        for r in range(n):
            res.append(list(numbers))
            numbers = numbers[shift:] + numbers[:shift]

        # shuffle the order of the rows
        shuffle(res)
        return res
    

    EDIT: I previously misunderstood the question. Here is a 'quick' method that generates a 'to-some-extent' random solution. The basic idea is:

        a, b, c
        b, c, a
        c, a, b
    

    We can just shift a row by a fixed step to form the next row, which satisfies the constraint as long as the step is coprime with the row length.

    So, here's the code:

    from math import gcd
    from random import choice, shuffle


    def solve(numbers):
        numbers = list(numbers)   # work on a copy; also accepts range(...)
        n = len(numbers)
        shuffle(numbers)
        # the shift must be coprime with n, otherwise rows repeat (e.g. n=4, shift=2)
        shift = choice([s for s in range(1, n) if gcd(s, n) == 1])
        res = []

        for r in range(n):
            res.append(list(numbers))
            numbers = numbers[shift:] + numbers[:shift]

        return res


    def check(arr):
        # every column must be free of duplicates
        for c in range(len(arr)):
            col = [arr[r][c] for r in range(len(arr))]
            if len(set(col)) != len(col):
                return False
        return True


    if __name__ == '__main__':
        from pprint import pprint
        res = solve(range(5))
        pprint(res)
        print(check(res))
    

    This is a possible solution using itertools, if you don't insist on using NumPy, which I'm not familiar with:

    import itertools
    from random import randint

    # itertools.permutations returns an iterator over all permutations of the given sequence
    perms = list(itertools.permutations(range(1, 6)))
    # pick one permutation (a single row) at random
    perms[randint(0, len(perms) - 1)]
    
  • 2021-01-05 01:22

    I can't type code from my phone, so here's the pseudocode:

    1. Create a matrix with one more dimension than the target matrix (3-D).

    2. Initialize each of the 25 elements with the numbers from 1 to 5.

    3. Iterate over the 25 elements.

    4. Choose a random value for the first element from its element list (which contains the numbers 1 through 5).

    5. Remove the randomly chosen value from all the elements in its row and column.

    6. Repeat steps 4 and 5 for all the elements.
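
    A rough Python sketch of these steps is below. One thing the pseudocode does not cover is what to do when a cell runs out of candidates, so this sketch simply starts over in that case; that restart, the max_restarts limit and the function name latin_square_eliminate are my assumptions, not part of the steps above.

```python
import random

def latin_square_eliminate(n=5, max_restarts=1000, rng=random):
    """Pick random candidates cell by cell, pruning rows and columns."""
    for _ in range(max_restarts):
        # Steps 1-2: every cell starts with the full candidate set 1..n.
        cands = [[set(range(1, n + 1)) for _ in range(n)] for _ in range(n)]
        grid = [[0] * n for _ in range(n)]
        ok = True
        for r in range(n):                    # step 3: visit every cell
            for c in range(n):
                if not cands[r][c]:
                    ok = False                # dead end: nothing left to pick
                    break
                # Step 4: choose a random remaining candidate.
                v = rng.choice(sorted(cands[r][c]))
                grid[r][c] = v
                # Step 5: remove the value from the cell's row and column.
                for k in range(n):
                    cands[r][k].discard(v)
                    cands[k][c].discard(v)
            if not ok:
                break
        if ok:
            return grid
    raise RuntimeError("no Latin square found within max_restarts")
```

    Calling latin_square_eliminate(5) returns a list of five rows. How often a restart is needed depends on n, and I have not measured it.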
