Generate all possible outcomes of k balls in n bins (sum of multinomial / categorical outcomes)

被撕碎了的回忆 2021-01-04 08:36

Suppose we have n bins in which we are throwing k balls. What is a fast (i.e. using numpy/scipy instead of python code) way to gen

4 Answers
  • 2021-01-04 08:44

    Here's a generator solution using itertools.combinations_with_replacement, where n is the number of balls and b the number of bins. I don't know whether it will suit your needs.

    import itertools
    import numpy

    def partitions(n, b):
        # Each row of the identity matrix is a one-hot vector for one bin.
        masks = numpy.identity(b, dtype=int)
        for c in itertools.combinations_with_replacement(masks, n):
            yield sum(c)
    
    output = numpy.array(list(partitions(3, 4)))
    # [[3 0 0 0]
    #  [2 1 0 0]
    #  ...
    #  [0 0 1 2]
    #  [0 0 0 3]]
    

    The output size, and therefore the running time, grows very quickly with n and b, so there is a sharp boundary between what is feasible and what is not.
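
    As a rough illustration of that growth, the number of outcomes for n balls in b bins is the multiset coefficient C(n+b-1, b-1). The sketch below just evaluates it for a few sizes; the helper name num_outcomes is made up for this example, and math.comb assumes Python 3.8 or later.

```python
from math import comb

def num_outcomes(n, b):
    """Number of ways to place n indistinguishable balls into b >= 1 bins."""
    return comb(n + b - 1, b - 1)

# Print the row count the enumeration would have to produce.
for n, b in [(3, 4), (10, 10), (20, 20)]:
    print(n, b, num_outcomes(n, b))
```

    For 3 balls in 4 bins this gives 20 rows, while 20 balls in 20 bins already gives more than 6 * 10**10, far too many to store as an array.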

    NumPy arrays need to know their size at construction, but that is easy to supply here because the multiset coefficient C(n+b-1, b-1) gives the row count directly. The version below may be faster, though I have not timed it.

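    from math import factorial as fact
    from itertools import combinations_with_replacement as cwr

    import numpy

    # Integer division keeps the result an int, as required for an array shape.
    nCr = lambda n, r: fact(n) // fact(n-r) // fact(r)

    def partitions(n, b):
        # Preallocate one row per multiset of size n drawn from b bins.
        partition_array = numpy.empty((nCr(n+b-1, b-1), b), dtype=int)
        masks = numpy.identity(b, dtype=int)
        for i, c in enumerate(cwr(masks, n)):
            # Summing n one-hot rows gives the per-bin counts.
            partition_array[i,:] = sum(c)
        return partition_array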
    
  • 2021-01-04 08:48

    Here is a naive implementation using list comprehensions (n balls, k bins). I'm not sure how its performance compares to numpy.

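    def gen(n, k):
        # One bin left: it must hold all n balls.
        if k == 1:
            return [[n]]
        # No balls left: every bin is empty.
        if n == 0:
            return [[0] * k]
        # Put n - x balls in the last bin and spread x over the first k - 1 bins.
        return [u + [n - x] for x in range(n + 1) for u in gen(x, k - 1)]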
    
    > gen(3,4)
    [[0, 0, 0, 3],
     [0, 0, 1, 2],
     [0, 1, 0, 2],
     [1, 0, 0, 2],
     [0, 0, 2, 1],
     [0, 1, 1, 1],
     [1, 0, 1, 1],
     [0, 2, 0, 1],
     [1, 1, 0, 1],
     [2, 0, 0, 1],
     [0, 0, 3, 0],
     [0, 1, 2, 0],
     [1, 0, 2, 0],
     [0, 2, 1, 0],
     [1, 1, 1, 0],
     [2, 0, 1, 0],
     [0, 3, 0, 0],
     [1, 2, 0, 0],
     [2, 1, 0, 0],
     [3, 0, 0, 0]]
    
  • 2021-01-04 08:50

    For reference, the following code uses Ehrlich's algorithm to iterate through all possible combinations of a multiset, with implementations in C++, JavaScript, and Python:

    https://github.com/ekg/multichoose

    Each multiset it yields can be converted to the count format above with np.bincount. Specifically,

    for s in multichoose(k, set):
        row = np.bincount(s, minlength=len(set) + 1)
    

    This still isn't pure numpy, but can be used to fill a preallocated numpy.array pretty quickly.
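
    A minimal sketch of that fill-a-preallocated-array idea follows. It does not use the multichoose package: itertools.combinations_with_replacement over bin indices is swapped in as a stand-in multiset iterator, and the function name counts_via_bincount is made up for this example. math.comb assumes Python 3.8 or later.

```python
import itertools
from math import comb

import numpy as np

def counts_via_bincount(k, n_bins):
    """All ways to place k balls in n_bins bins, one row per outcome."""
    n_rows = comb(k + n_bins - 1, n_bins - 1)
    out = np.empty((n_rows, n_bins), dtype=np.intp)
    # Stand-in for multichoose: each s is a sorted tuple of k bin indices.
    multisets = itertools.combinations_with_replacement(range(n_bins), k)
    for i, s in enumerate(multisets):
        idx = np.asarray(s, dtype=np.intp)
        # Count how many balls landed in each bin.
        out[i] = np.bincount(idx, minlength=n_bins)
    return out

table = counts_via_bincount(3, 4)
print(table.shape)
print(table[:3])
```

    Because the bin indices run from 0 to n_bins - 1, minlength=n_bins is enough here; a different labelling of the bins would need a different minlength.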

  • 2021-01-04 08:54

    Here's the solution I came up with for this.

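    import itertools
    import numpy

    def multinomial_combinations(n, k, max_power=None):
        """Return a 2-D numpy array of all length-k sequences of
        non-negative integers n1, ..., nk such that n1 + ... + nk = n."""
        # Choose k-1 bar positions among the n+k-1 slots (stars and bars).
        bar_placements = itertools.combinations(range(1, n+k), k-1)
        # Add sentinel bars at both ends so every bin has two boundaries.
        tmp = [(0,) + x + (n+k,) for x in bar_placements]
        # Gaps between consecutive bars, minus the bar itself, are the counts.
        sequences = numpy.diff(tmp) - 1
        if max_power is not None:
            # Keep only rows whose entries are all <= max_power.
            return sequences[(sequences <= max_power).all(axis=1)][::-1]
        return sequences[::-1]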
    

    Note 1: The [::-1] at the end just reverses the order to match your example output.

    Note 2: Finding these sequences is equivalent to finding all ways to arrange n stars and k-1 bars in n+k-1 spots (see stars and bars, Theorem 2).

    Note 3: The max_power argument gives you the option to return only sequences in which every integer is at most max_power.
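
    To make the stars-and-bars mapping in Note 2 concrete, here is a small worked sketch for n = 2 balls and k = 3 bins. It repeats the same steps as the function above on a case small enough to check by hand; the variable names bars, padded, and counts are chosen just for this illustration.

```python
import itertools

import numpy as np

n, k = 2, 3

# Each placement picks k-1 bar positions among slots 1 .. n+k-1.
bars = list(itertools.combinations(range(1, n + k), k - 1))

# Sentinel bars at 0 and n+k close off the first and last bins.
padded = np.array([(0,) + b + (n + k,) for b in bars])

# The gap between neighbouring bars, minus one, is the number of stars between them.
counts = np.diff(padded, axis=1) - 1

for b, row in zip(bars, counts):
    print(b, "->", row.tolist())
```

    There are C(4, 2) = 6 bar placements, so the loop prints six rows, each summing to 2. For instance, bars at slots (1, 3) leave nothing before the first bar, one star between the bars, and one star after, which is the outcome [0, 1, 1].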
