An algorithm for randomly generating integer partitions of a particular length, in Python?

放肆的年华 提交于 2019-11-30 05:27:30

问题


I've been using the random_element() function provided by SAGE to generate random integer partitions for a given integer (N) that are a particular length (S). I'm trying to generate unbiased random samples from the set of all partitions for given values of N and S. SAGE's function quickly returns random partitions for N (i.e. Partitions(N).random_element()).

However, it slows immensely when adding S (i.e. Partitions(N,length=S).random_element()). Likewise, filtering out random partitions of N that are of length S is incredibly slow.

However, and I hope this helps someone, I've found that in the case when the function returns a partition of N not matching the length S, that the conjugate partition is often of length S. That is:

S = 10
N = 100
part = list(Partitions(N).random_element())
    if len(part) != S:
        SAD = list(Partition(part).conjugate())
        if len(SAD) != S:
            continue

This increases the rate at which partitions of length S are found and appears to produce unbiased samples (I've examined the results against entire sets of partitions for various values of N and S).

However, I'm using values of N (e.g. 10,000) and S (e.g. 300) that make even this approach impractically slow. The comment associated with SAGE's random_element() function admits there is plenty of room for optimization. So, is there a way to more quickly generate unbiased (i.e. random uniform) samples of integer partitions matching given values of N and S, perhaps, by not generating partitions that do not match S? Additionally, using conjugate partitions works well in many cases to produce unbiased samples, but I can't say that I precisely understand why.


回答1:


Finally, I have a definitively unbiased method that has a zero rejection rate. Of course, I've tested it to make sure the results are representative samples of entire feasible sets. It's very fast and totally unbiased. Enjoy.

from sage.all import *
import random

First, a function to find the smallest maximum addend for a partition of n with s parts

def min_max(n,s):

    _min = int(floor(float(n)/float(s)))
    if int(n%s) > 0:
        _min +=1

    return _min

Next, A function that uses a cache and memoiziation to find the number of partitions of n with s parts having x as the largest part. This is fast, but I think there's a more elegant solution to be had. e.g., Often: P(N,S,max=K) = P(N-K,S-1) Thanks to ante (https://stackoverflow.com/users/494076/ante) for helping me with this: Finding the number of integer partitions given a total, a number of parts, and a maximum summand

D = {}
def P(n,s,x):
    if n > s*x or x <= 0: return 0
    if n == s*x: return 1
    if (n,s,x) not in D:
        D[(n,s,x)] = sum(P(n-i*x, s-i, x-1) for i in xrange(s))
    return D[(n,s,x)]

Finally, a function to find uniform random partitions of n with s parts, with no rejection rate! Each randomly chosen number codes for a specific partition of n having s parts.

def random_partition(n,s):
    S = s
    partition = []
    _min = min_max(n,S)
    _max = n-S+1

    total = number_of_partitions(n,S)
    which = random.randrange(1,total+1) # random number

    while n:
        for k in range(_min,_max+1):
            count = P(n,S,k)
            if count >= which:
                count = P(n,S,k-1)
                break

        partition.append(k)
        n -= k
        if n == 0: break
        S -= 1
        which -= count
        _min = min_max(n,S)
        _max = k

    return partition



回答2:


Simple approach: randomly assign the integers:

def random_partition(n, s):
    partition = [0] * s
    for x in range(n):
        partition[random.randrange(s)] += 1
    return partition



回答3:


I ran into a similar problem when I was trying to calculate the probability of the strong birthday problem.

First off, the partition function explodes when given only modest amount of numbers. You'll be returning a LOT of information. No matter which method you're using N = 10000 and S = 300 will generate ridiculous amounts of data. It will be slow. Chances are any pure python implementation you use will be equally slow or slower. Look to making a CModule.

If you want to try python the approach I took as a combination of itertools and generators to keep memory usage down. I don't seem to have my code handy anymore, but here's a good impementation:

http://wordaligned.org/articles/partitioning-with-python

EDIT:

Found my code:

def partition(a, b=-1, limit=365):
  if (b == -1):
    b = a
  if (a == 2 or a == 3):
    if (b >= a and limit):
      yield [a]
    else:
      return
  elif (a > 3):
    if (a <= b):
      yield [a]
    c = 0
    if b > a-2:
      c = a-2
    else:
      c = b
    for i in xrange(c, 1, -1):
      if (limit):
        for j in partition(a-i, i, limit-1):
          yield [i] + j


来源:https://stackoverflow.com/questions/10287021/an-algorithm-for-randomly-generating-integer-partitions-of-a-particular-length

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!