Question
I want to return a value l with 1 < l < 10,
with probability 1/(2^(l-1)).
How should I do this, other than with:
x = random()
if x < 0.5:
    return 2
and so on?
Thank you.
Answer 1:
This is going to be fun... I am a bit rusty with these things, so a good mathematician could fix my reasoning.
To generate a distribution from a formula, you first need to do some integrals and calculate the cumulative distribution function (CDF) for the specified interval. In particular, we need to start by calculating the normalization constant.
This integral gives the constant "k".
The "meaning" of the CDF is "what is the probability of obtaining a number that belongs to the interval I need?". This question can be seen another way: "the probability of drawing a number less than or equal to 10 must be 1". This leads to an equation that helps to find the parameter "C": the first term is k, and the second term is the general integral of 2^(1-x) with x replaced by 10.
Solving this, we finally reach the CDF (again, there may be an easier way to find it).
At this point we need to invert the CDF in terms of X, where X is now the output of our random number generator, between 0 and 1.
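The steps above can be sketched as follows. This is a reconstruction under the assumption that the density is C·2^(1-x) on the interval [1, 10]; the symbols f and F are introduced here for the density and the CDF, while k, C, and X follow the text. The last line agrees with the expression used in the Python code.

```latex
\begin{align*}
f(x) &= C \, 2^{1-x}, \qquad 1 \le x \le 10 \\
F(x) &= k - \frac{C}{\ln 2} \, 2^{1-x} \\
F(1) = 0 &\;\Rightarrow\; k = \frac{C}{\ln 2} \\
F(10) = 1 &\;\Rightarrow\; k - \frac{C}{\ln 2} \, 2^{-9} = 1
          \;\Rightarrow\; C = \frac{\ln 2}{1 - 2^{-9}} \\
F(x) &= \frac{1 - 2^{1-x}}{1 - 2^{-9}} \\
X = F(x) &\;\Rightarrow\; x = 1 - \log_2\!\left(1 - \left(1 - 2^{-9}\right) X\right)
\end{align*}
```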
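In Python I tried the following:
import numpy as np
import matplotlib.pyplot as plt
# inverse-CDF sampling: map uniform numbers in [0, 1) to values in [1, 10]
a = [1 - np.log2(1 - (1 - 2**(-9)) * np.random.rand()) for i in range(10000)]
plt.hist(a, density=True)  # density replaces the removed normed argument
plt.show()
Does it make sense?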
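The same inverse-CDF idea can be sketched with only the standard library, which makes it easy to check the sampler against the CDF. This is a minimal sketch, not the answer's own code: it assumes the density is proportional to 2^(1-x) on [1, 10], and the names cdf, inverse_cdf, and sample are hypothetical helpers introduced here.

```python
import math
import random

X_MIN, X_MAX = 1.0, 10.0
NORM = 1.0 - 2.0 ** (X_MIN - X_MAX)  # 1 - 2^(-9), assumed normalization


def cdf(x):
    """CDF of the density proportional to 2^(1-x), truncated to [1, 10]."""
    return (1.0 - 2.0 ** (X_MIN - x)) / NORM


def inverse_cdf(u):
    """Map a uniform number u in [0, 1) to a value in [1, 10)."""
    return X_MIN - math.log2(1.0 - NORM * u)


def sample(n, seed=None):
    """Draw n values by pushing uniform numbers through the inverse CDF."""
    rng = random.Random(seed)
    return [inverse_cdf(rng.random()) for _ in range(n)]
```

With enough samples, the fraction of values at or below 2 should come out close to cdf(2.0), roughly 0.501.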
Answer 2:
While @Fabrizio's answer is probably correct, there is a much simpler way to get the job done. What you want is a truncated exponential distribution, because your PDF looks like
PDF(x) ~ 2^(-x) = e^(-x log(2)).
SciPy already provides a truncated exponential distribution, scipy.stats.truncexpon.
Just set the proper scale and location, and the job is done. Code:
import numpy as np
from scipy.stats import truncexpon
import matplotlib.pyplot as plt
vmin = 1.0
vmax = 10.0
scale = 1.0/np.log(2.0)  # 2^(-x) = e^(-x/scale)
# b is the upper truncation point, measured from loc in units of scale
r = truncexpon.rvs(b=(vmax-vmin)/scale, loc=vmin, scale=scale, size=100000)
print(np.min(r))
print(np.max(r))
plt.hist(r, bins=[1,2,3,4,5,6,7,8,9,10], density=True)
plt.show()
[Figure: histogram of the sampled values]
If you need to sample only integer values, NumPy has a good helper function as well, np.random.choice. The code is below; the graph is quite similar.
import numpy as np
import matplotlib.pyplot as plt
vmin = 1
vmax = 10
v = np.arange(vmin+1, vmax, dtype=np.int64)  # integers 2..9, since 1 < l < 10
p = np.asarray([1.0/2**(l-1) for l in range(vmin+1, vmax)])  # probabilities
p /= np.sum(p)  # normalization
r = np.random.choice(v, size=100000, replace=True, p=p)
print(np.min(r))
print(np.max(r))
plt.hist(r, bins=[1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5,9.5], density=True)  # one bin per integer
plt.show()
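For the integer case, the weighted draw can also be done without NumPy. The following is a minimal sketch, assuming l ranges over the integers 2 to 9; sample_integers is a hypothetical helper name introduced here. random.choices normalizes the weights itself, so no explicit division by their sum is needed.

```python
import random


def sample_integers(n, low=2, high=9, seed=None):
    """Draw n integers l in [low, high] with weight 1/2^(l-1) each."""
    values = list(range(low, high + 1))
    # Unnormalized weights; for l = 2..9 they sum to 1 - 2^(-8), not 1.
    weights = [1.0 / 2 ** (l - 1) for l in values]
    rng = random.Random(seed)
    return rng.choices(values, weights=weights, k=n)
```

Because the weights for l = 2..9 sum to 1 - 2^(-8), the realized probabilities are slightly above 1/2^(l-1) after normalization; for example, l = 2 is drawn with probability of about 0.502 rather than exactly 0.5.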
Source: https://stackoverflow.com/questions/58625669/implementing-specific-distribution-in-python