I would like to generate n random numbers (e.g., n=200), where the range of possible values is between 2 and 40, with a mean of 12 and a median of 6.5.
I searche
If you have a bunch of smaller arrays with the right median and mean, you can combine them to produce a larger array.
So... you can pre-generate smaller arrays as you are currently doing, and then combine them randomly for larger n. Of course, this will result in a biased random sample, but it sounds like you just want something that's approximately random.
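As a quick sanity check of the combining idea, here is a minimal sketch using only the standard library. The two blocks `a` and `b` are hand-picked for illustration (they are an assumption, not output from any real run); each has mean 12 and median 6.5, and the check is that their concatenation does too.

```python
import statistics

# Hand-picked illustrative blocks: each has mean 12 and median 6.5.
a = [2, 6, 7, 33]            # sum 48 over 4 values
b = [3, 5, 6, 7, 11, 40]     # sum 72 over 6 values

combined = a + b

# Both statistics survive concatenation.
print(statistics.mean(combined), statistics.median(combined))
```

The mean carries over because every block averages 12. The median carries over because in each block half the values are <= 6 and half are >= 7, with a 6 and a 7 in the middle, so the same holds for the combined list.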
Here's working (py3) code that generates a sample of size 5000 with your desired properties, which it builds from smaller samples of size 4, 6, 8, 10, ..., 18.
Note that I changed how the smaller random samples are built: half of the numbers must be <= 6 and half >= 7 if the median is to be 6.5, so we generate those halves independently. This speeds things up massively.
import collections
import numpy as np
import random

# Pre-generate 50 small samples, grouped by their length (4, 6, ..., 18).
rs = collections.defaultdict(list)
for i in range(50):
    n = random.randrange(4, 20, 2)
    while True:
        # Lower half from 2..6, upper half from 7..40
        # (np.random.randint excludes the upper bound).
        x = np.append(np.random.randint(2, 7, size=n//2),
                      np.random.randint(7, 41, size=n//2))
        if x.mean() == 12 and np.median(x) == 6.5:
            break
    rs[len(x)].append(x)

def random_range(n):
    if n % 2:
        raise AssertionError("%d must be even" % n)
    r = []
    while n:
        i = random.randrange(4, min(20, n+1), 2)
        # Don't be left with only 2 slots left.
        if n - i == 2: continue
        xs = random.choice(rs[i])
        r.extend(xs)
        n -= i
    random.shuffle(r)
    return r

xs = np.array(random_range(5000))
# Count how often each value from 2 to 40 occurs.
print([(i, list(xs).count(i)) for i in range(2, 41)])
print(len(xs))
print(xs.mean())
print(np.median(xs))
Output:
[(2, 620), (3, 525), (4, 440), (5, 512), (6, 403), (7, 345), (8, 126), (9, 111), (10, 78), (11, 25), (12, 48), (13, 61), (14, 117), (15, 61), (16, 62), (17, 116), (18, 49), (19, 73), (20, 88), (21, 48), (22, 68), (23, 46), (24, 75), (25, 77), (26, 49), (27, 83), (28, 61), (29, 28), (30, 59), (31, 73), (32, 51), (33, 113), (34, 72), (35, 33), (36, 51), (37, 44), (38, 25), (39, 38), (40, 46)]
5000
12.0
6.5
The first line of the output shows that there are 620 2's, 525 3's, 440 4's etc. in the final array.
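For reference, the half-and-half trick from the note above can be isolated into a small helper. This is a minimal sketch, assuming a hypothetical function name `draw_block` and using only the standard library instead of NumPy; the seed is there only to make the run repeatable.

```python
import random
import statistics

def draw_block(n, rng):
    """Rejection-sample one block of even size n with mean 12 and median 6.5.

    `draw_block` is a hypothetical helper name used for illustration.
    """
    half = n // 2
    while True:
        # random.Random.randint includes both endpoints.
        low = [rng.randint(2, 6) for _ in range(half)]
        high = [rng.randint(7, 40) for _ in range(half)]
        block = low + high
        # Compare integer sums to avoid any floating-point mean check.
        if sum(block) == 12 * n and statistics.median(block) == 6.5:
            return block

rng = random.Random(0)  # seeded only for repeatability
block = draw_block(6, rng)
print(sorted(block), statistics.median(block))
```

Because the lower half can never exceed 6 and the upper half can never drop below 7, the median check only fails when no 6 or no 7 was drawn, which is why far fewer candidates get rejected than when all values are drawn from 2..40 at once.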