Python “Stars and Bars”

瘦欲@ posted on 2021-01-28 03:28:59

Question


I am trying to enumerate all possible ways to distribute n candies among k children. For instance, according to the stars-and-bars formula, the number of ways to distribute 96 candies among 5 children is 100! / (96! * 4!) = 3 921 225, so the result should be 3 921 225 tuples of size 5.

import itertools as it

list2 = [item for item in it.product(range(97), repeat=5)
         if sum(item) == 96]

My PC seems to be overwhelmed by the complexity. I estimate that each tuple consumes 24 * 5 = 120 bytes of memory, which gives 3 921 225 * 120 = 470 547 000 bytes, or roughly 450 MB. That doesn't seem like much. Why is my PC so slow at generating this list? What am I missing?


Answer 1:


Here is one way to make your approach work. It uses itertools.combinations and takes a few seconds to build the complete list. For a faster, NumPy-based approach, see the end of this answer.

It works by enumerating all combinations of four inner bars chosen from the positions 1 to 100, always adding the outer bars 0 and 101. The allocations for the five kids are then what lies between consecutive bars, i.e. the difference between neighbouring bars minus one.

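import numpy as np
import itertools

# bars[0] and bars[-1] are the fixed outer bars; the loop target bars[1:-1]
# is overwritten in place with each combination of four inner bars.
bars = [0, 0, 0, 0, 0, 101]
result = [[bars[j+1] - bars[j] - 1 for j in range(5)]
          for bars[1:-1] in itertools.combinations(range(1, 101), 4)]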

# sanity check
len(result)
# 3921225
# show a few samples
from pprint import pprint
pprint(result[::400000])
# [[0, 0, 0, 0, 96],
#  [2, 26, 12, 8, 48],
#  [5, 17, 22, 7, 45],
#  [8, 23, 30, 16, 19],
#  [12, 2, 73, 9, 0],
#  [16, 2, 25, 40, 13],
#  [20, 29, 24, 0, 23],
#  [26, 13, 34, 14, 9],
#  [33, 50, 4, 5, 4],
#  [45, 21, 26, 1, 3]]
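
The same bar construction can be generalized to any number of candies n and children k. The following is a minimal sketch and not part of the original answer: the function name distributions is hypothetical, and it assumes n >= 0 and k >= 1. It places k - 1 inner bars among the positions 1 to n + k - 1, with outer bars at 0 and n + k, and yields the gaps lazily so the full list need not be held in memory.

```python
import itertools


def distributions(n, k):
    """Yield every k-tuple of non-negative integers that sums to n.

    Hypothetical helper: generalizes the bars trick shown above.
    Assumes n >= 0 and k >= 1.
    """
    # Choose k - 1 inner bar positions from 1 .. n + k - 1.
    for inner in itertools.combinations(range(1, n + k), k - 1):
        # Add the fixed outer bars at 0 and n + k.
        bars = (0, *inner, n + k)
        # Each allocation is the gap between neighbouring bars, minus one.
        yield tuple(bars[j + 1] - bars[j] - 1 for j in range(k))


# Small case: 2 candies among 3 children gives C(4, 2) = 6 tuples.
small = list(distributions(2, 3))
print(small)
```

With n = 96 and k = 5 this reproduces the bar positions used above (inner bars from 1 to 100, outer bars 0 and 101).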

Why is yours so slow? Mostly because your loop is wasteful: it visits all 97^5 (about 8.6 billion) candidate tuples and keeps only those that sum to 96, whereas there are just 100 choose 4 = 3 921 225 of them.
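
To put numbers on that gap, here is a quick check of both search-space sizes. It is only an illustrative sketch; the variable names are made up, and math.comb requires Python 3.8 or later.

```python
import math

# Tuples visited by it.product(range(97), repeat=5).
candidates = 97 ** 5
# Tuples that actually sum to 96 (stars and bars: 100 choose 4).
solutions = math.comb(100, 4)

print(candidates)               # 8587340257
print(solutions)                # 3921225
print(candidates // solutions)  # 2189
```

So the brute-force filter does roughly two thousand times more work than enumerating the bar positions directly.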

If you want it really fast, you can replace itertools.combinations with a numpy version:

https://stackoverflow.com/a/42202157/7207392

def fast_comb(n, k):
    a = np.ones((k, n-k+1), dtype=int)
    a[0] = np.arange(n-k+1)
    for j in range(1, k):
        reps = (n-k+j) - a[j-1]
        a = np.repeat(a, reps, axis=1)
        ind = np.add.accumulate(reps)
        a[j, ind[:-1]] = 1-reps[1:]
        a[j, 0] = j
        a[j] = np.add.accumulate(a[j])
    return a

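# One column per combination of four inner bars drawn from 0 .. 99.
fb = fast_comb(100, 4)
# Add the outer bars -1 and 100 as the first and last rows.
sb = np.empty((6, fb.shape[1]), int)
sb[0], sb[1:5], sb[5] = -1, fb, 100
# Gaps between neighbouring bars, minus one, are the five allocations.
result = np.diff(sb.T) - 1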

result.shape
# (3921225, 5)
result[::400000]
# array([[ 0,  0,  0,  0, 96],
#        [ 2, 26, 12,  8, 48],
#        [ 5, 17, 22,  7, 45],
#        [ 8, 23, 30, 16, 19],
#        [12,  2, 73,  9,  0],
#        [16,  2, 25, 40, 13],
#        [20, 29, 24,  0, 23],
#        [26, 13, 34, 14,  9],
#        [33, 50,  4,  5,  4],
#        [45, 21, 26,  1,  3]])

This takes about one second.




Answer 2:


I see two problems with your math.

First, you're describing a combination there. Effectively, you're thinking (96 choose 5), which doesn't cover all permutations.

Second, the permutation would actually be 96!/91!, which is several orders of magnitude higher than ~4 million.

Just by adding the byte count, you're in the high gigabyte range of memory usage now, which could explain why your machine is slowing down; the memory usage alone generated from this could crush most modern consumer machines.



Source: https://stackoverflow.com/questions/53561814/python-stars-and-bars
