I want to generate a random list of numbers whose sum equals a randomly chosen number. For example, if the randomly chosen number is 5, the distrib
Let n be the number you want the values to add up to. Generate a random sample of random size (less than n), consisting of values in the range 1 to n, exclusive of n. Now add the endpoints 0 and n, and sort. Successive differences of the sorted values will sum to n.
import random as r

def random_sum_to(n):
    # draw between 1 and n-1 distinct cut points strictly between 0 and n (needs n >= 2)
    a = r.sample(range(1, n), r.randint(1, n-1)) + [0, n]
    a.sort()
    return [a[i+1] - a[i] for i in range(len(a) - 1)]

print(random_sum_to(20))  # yields, e.g., [4, 1, 1, 2, 4, 2, 2, 4]
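To see why the differences always sum to n, it can help to trace one fixed set of cut points by hand. The sketch below uses a hypothetical helper, `differences_from_cuts`, which is not part of the answer above; it only applies the sort-and-difference step to cut points you supply yourself.

```python
def differences_from_cuts(n, cuts):
    """Turn distinct cut points in 1..n-1 into positive parts that sum to n.

    Hypothetical helper, for illustration only.
    """
    points = sorted([0, n] + list(cuts))
    # The sum telescopes: (p1 - 0) + (p2 - p1) + ... + (n - pk) == n.
    return [b - a for a, b in zip(points, points[1:])]


print(differences_from_cuts(10, [7, 3]))  # [3, 4, 3]
```

Here the sorted points are 0, 3, 7, 10, so the parts are 3, 4 and 3. Because the cut points are distinct and lie strictly between 0 and n, every part is at least 1.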
If you'd like to be able to specify the number of terms in the sum explicitly, or have it be random if unspecified, add an optional argument:
import random as r

def random_sum_to(n, num_terms=None):
    # num_terms parts need num_terms - 1 cut points; pick 2..n terms if unspecified
    num_cuts = (num_terms or r.randint(2, n)) - 1
    a = r.sample(range(1, n), num_cuts) + [0, n]
    a.sort()
    return [a[i+1] - a[i] for i in range(len(a) - 1)]

print(random_sum_to(20, 3))  # [9, 7, 4] for example
print(random_sum_to(5))      # [1, 1, 2, 1] for example
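If num_terms is larger than n, random.sample raises a ValueError whose message does not mention the real cause. A minimal sketch of a variant with explicit checks follows. The name `random_sum_to_checked` is made up for this example, and unlike the version above it allows a single term when num_terms is unspecified, so that n = 1 also works; that is a choice of this sketch, not of the original answer.

```python
import random


def random_sum_to_checked(n, num_terms=None):
    """Like random_sum_to, but with explicit argument checks (illustrative name)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if num_terms is None:
        num_terms = random.randint(1, n)
    elif not 1 <= num_terms <= n:
        raise ValueError("num_terms must be between 1 and n")
    # num_terms parts need num_terms - 1 distinct interior cut points.
    cuts = sorted(random.sample(range(1, n), num_terms - 1))
    points = [0] + cuts + [n]
    return [b - a for a, b in zip(points, points[1:])]


print(random_sum_to_checked(20, 3))
```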
In a loop, you could keep drawing a random number between 1 and the remaining sum until you've reached your total:
from random import randint

def generate_values(n):
    values = []
    while n > 0:
        value = randint(1, n)
        values.append(value)
        n -= value
    return values
A few sample runs of this function:
>>> generate_values(20)
[17, 1, 1, 1]
>>> generate_values(20)
[10, 4, 4, 1, 1]
>>> generate_values(20)
[14, 4, 1, 1]
>>> generate_values(20)
[5, 2, 4, 1, 5, 1, 1, 1]
>>> generate_values(20)
[2, 13, 5]
>>> generate_values(20)
[14, 3, 2, 1]
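The samples above tend to start with a large value. That is expected: the first draw is uniform on 1..n, so its mean is (n + 1) / 2, and later draws only get what is left. Below is a rough sketch that checks this empirically. It repeats `generate_values` so the block runs on its own; the trial count and the seed are arbitrary choices.

```python
import random


def generate_values(n):
    values = []
    while n > 0:
        value = random.randint(1, n)
        values.append(value)
        n -= value
    return values


random.seed(0)  # arbitrary seed, only to make the run repeatable
trials = 10_000
samples = [generate_values(20) for _ in range(trials)]

mean_first = sum(s[0] for s in samples) / trials
mean_length = sum(len(s) for s in samples) / trials
print(f"mean first value: {mean_first:.2f}")  # expected to be close to (20 + 1) / 2 = 10.5
print(f"mean number of terms: {mean_length:.2f}")
```

If you want the terms to be more evenly sized, or want to control how many there are, this front-loading may be a reason to prefer the cut-point approach.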
Consider the continuous version of the problem first. Ignoring the target number for a moment, sample values X_i in the interval [0, 1], uniformly over all combinations whose sum is equal to 1:
X_1 + X_2 + ... + X_n = 1
This is a well-known distribution, the Dirichlet distribution; drawing from it is also called simplex sampling, and it can be done by normalizing gamma variates. See details and discussion at Generating N uniform random numbers that sum to M. One can use random.gammavariate(a, 1), or, for correct handling of corners, note that a gamma variate with shape parameter 1 is equivalent to an exponential variate, which can be sampled directly as in the code below:
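import math
import random

def simplex_sampling(n):
    r = []
    total = 0.0  # named total to avoid shadowing the built-in sum()
    for k in range(n):
        x = random.random()
        if x == 0.0:
            # log(0) is undefined; return a corner of the simplex instead
            return (1.0, make_corner_sample(n, k))
        t = -math.log(x)
        r.append(t)
        total += t
    return (total, r)

def make_corner_sample(n, k):
    # vector with 1.0 at index k and 0.0 everywhere else
    r = []
    for i in range(n):
        if i == k:
            r.append(1.0)
        else:
            r.append(0.0)
    return r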
So simplex_sampling gives you the vector and its sum, which is used for normalization.
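The sample-then-normalize idea can also be sketched with the standard library's random.expovariate, which draws exponential variates directly. This is an alternative sketch, not the code from the answer: the function name `simplex_sample` is made up here, and it returns the already normalized vector instead of a (sum, vector) pair.

```python
import random


def simplex_sample(n):
    """Return n non-negative floats that sum to 1, up to rounding error.

    Illustrative sketch: exponential variates with rate 1, divided by their sum.
    Assumes the draws are not all exactly 0.0, which is vanishingly unlikely.
    """
    draws = [random.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


point = simplex_sample(5)
print(point, sum(point))
```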
To use simplex_sampling for, say, N = 5:
N = 5
total, r = simplex_sampling(N)
norm = float(N) / total
# normalization together with mapping back to integers
result = []
for k in range(N):
    # the t values are floats in [0.0, N] that sum to N
    t = r[k] * norm
    # a boundary check might be useful here, but
    # conversion to integers by truncation is simple:
    # values in [0, 1) are converted to 0 (so zeros are possible),
    # values in [1, 2) are converted to 1, etc.
    # Note: truncation drops the fractional parts, so the
    # integers generally sum to less than N.
    result.append(int(t))
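Truncating with int() drops the fractional parts, so the integers generally add up to less than N. One way to restore the exact total, offered here as a sketch and not as the answer's own method, is largest-remainder rounding: floor every value, then hand the missing units to the entries with the largest fractional parts. The name `round_preserving_sum` is made up for this example, and the sketch assumes the inputs are non-negative and sum to the target up to rounding error.

```python
import math


def round_preserving_sum(values, total):
    """Round non-negative floats summing to about `total` into integers summing to `total`.

    Largest-remainder rounding; hypothetical helper, for illustration only.
    """
    floors = [math.floor(v) for v in values]
    shortfall = total - sum(floors)
    # Indices ordered by descending fractional part.
    order = sorted(range(len(values)), key=lambda i: values[i] - floors[i], reverse=True)
    for i in order[:shortfall]:
        floors[i] += 1
    return floors


print(round_preserving_sum([0.4, 2.3, 1.6, 0.7], 5))  # [0, 2, 2, 1]
```

The result can still contain zeros. If every term must be at least 1, the cut-point approach is the simpler fit.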