I am new to using Statistical functions in xls. I am able to the KURT function in xls to calculate the Kurtosis or Skewness, given a set of numbers.
But my requirement is to do it in the other way, like for a given Skewness or Kurtosis, is there a way to generate random numbers. Any pointers on how to do that.
The function should take the skewness or Kurtosis value as input, and it should generate 50 random numbers with 1 being minimum and 100,000 being maximum.
If Excel does not have a way, I am looking for suggestions in Python.
Can you please help me how to do this in Excel or Python?
After experimenting with several distributions, the generalised Gamma distribution seems to be flexible enough to adjust either the skew or the kurtosis to the desired value, but not both at the same time like what was asked in the question @gabriel mentioned in his comment.
So to draw a sample out of a g-Gamma distribution with a single fixed moment, you can use scipy.optimize
to find a distribution with minimizes a penalty function (I chose (target - value) ** 2
)
from scipy import stats, optimize
import numpy as np
def random_by_moment(moment, value, size):
""" Draw `size` samples out of a generalised Gamma distribution
where a given moment has a given value """
assert moment in 'mvsk', "'{}' invalid moment. Use 'm' for mean,"\
"'v' for variance, 's' for skew and 'k' for kurtosis".format(moment)
def gengamma_error(a):
m, v, s, k = (stats.gengamma.stats(a[0], a[1], moments="mvsk"))
moments = {'m': m, 'v': v, 's': s, 'k': k}
return (moments[moment] - value) ** 2 # has its minimum at the desired value
a, c = optimize.minimize(gengamma_error, (1, 1)).x
return stats.gengamma.rvs(a, c, size=size)
n = random_by_moment('k', 3, 100000)
# test if result is correct
print("mean={}, var={}, skew={}, kurt={}".format(np.mean(n), np.var(n), stats.skew(n), stats.kurtosis(n)))
Before that I came up with a function that matches skew and kurtosis. However even the g-Gamma is not flexible enough to serve this purpose depending on how extreme your conditions are
def random_by_sk(skew, kurt, size):
def gengamma_error(a):
s, k = (stats.gengamma.stats(a[0], a[1], moments="sk"))
return (s - skew) ** 2 + (k - kurt) ** 2 # penalty equally weighted for skew and kurtosis
a, c = optimize.minimize(gengamma_error, (1, 1)).x
return stats.gengamma.rvs(a, c, size=size)
n = random_by_sk(3, 3, 100000)
print("mean={}, var={}, skew={}, kurt={}".format(np.mean(n), np.var(n), stats.skew(n), stats.kurtosis(n)))
# will yield skew ~2 and kurtosis ~3 instead of 3, 3
来源:https://stackoverflow.com/questions/36898729/generating-numbers-distribution-for-a-given-kurtosis-or-skewness