Fitting a gamma distribution with (python) Scipy

情歌与酒 2020-12-09 02:24

Can anyone help me fit a gamma distribution in Python? I've got some data (X and Y coordinates), and I want to find the gamma parameters that fit this dis

5 Answers
  • 2020-12-09 02:50

    1) The "data" variable can be a Python list or tuple, or a numpy.ndarray, which can be obtained with:

    data = numpy.array(data)
    

    where the second "data" in the line above is a list or tuple containing your data.

    2) The "parameter" variable is a first guess that you can optionally pass to the fitting function as a starting point for the fit, so it can be omitted.

    3) A note on @mondano's answer: using moments (mean and variance) to work out the gamma parameters is reasonably good for large shape parameters (alpha > 10), but can yield poor results for small values of alpha (see Statistical Methods in the Atmospheric Sciences by Wilks, and Thom, H. C. S., 1958: A note on the gamma distribution. Mon. Wea. Rev., 86, 117–122).

    Maximum likelihood estimation, as implemented in the scipy module, is regarded as a better choice in such cases.
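As a rough illustration of points 1 and 3, the sketch below converts a list to a numpy.ndarray and compares method-of-moments estimates with scipy's maximum likelihood fit for a small shape parameter. The synthetic data, the seed, the true parameter values, and the choice to hold the location at zero with floc=0 (so that both estimators target the same two-parameter gamma) are assumptions made for this example only.

```python
import numpy as np
from scipy import stats

# Synthetic data with a small shape parameter (assumed values for illustration).
true_alpha, true_scale = 0.5, 2.0
rng = np.random.default_rng(0)
raw = list(stats.gamma.rvs(true_alpha, scale=true_scale, size=5000, random_state=rng))

# A list or tuple can be converted to a numpy.ndarray before fitting.
data = np.array(raw)

# Maximum likelihood fit; floc=0 holds the location fixed at zero.
mle_alpha, mle_loc, mle_scale = stats.gamma.fit(data, floc=0)

# Method-of-moments estimates from the sample mean and variance.
mean, var = data.mean(), data.var()
mom_alpha = mean**2 / var
mom_scale = var / mean

print("MLE:    ", mle_alpha, mle_scale)
print("Moments:", mom_alpha, mom_scale)
```

Repeating the run with several seeds and smaller sample sizes gives a feel for how much each estimator scatters around the true shape value.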

  • 2020-12-09 03:01

    OpenTURNS has a simple way to do this with the GammaFactory class.

    First, let's generate a sample:

    import openturns as ot
    gammaDistribution = ot.Gamma()
    sample = gammaDistribution.getSample(100)
    

    Then fit a Gamma to it:

    distribution = ot.GammaFactory().build(sample)
    

    Then we can draw the PDF of the Gamma:

    import openturns.viewer as otv
    otv.View(distribution.drawPDF())
    

    which produces a plot of the fitted gamma PDF (image not reproduced here).

    More details on this topic at: http://openturns.github.io/openturns/latest/user_manual/_generated/openturns.GammaFactory.html

  • 2020-12-09 03:02

    Generate some gamma data:

    import scipy.stats as stats    
    alpha = 5
    loc = 100.5
    beta = 22
    data = stats.gamma.rvs(alpha, loc=loc, scale=beta, size=10000)    
    print(data)
    # [ 202.36035683  297.23906376  249.53831795 ...,  271.85204096  180.75026301
    #   364.60240242]
    

    Here we fit the data to the gamma distribution:

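    # gamma.fit returns the estimates in the order (shape, loc, scale)
    fit_alpha, fit_loc, fit_beta = stats.gamma.fit(data)
    print(fit_alpha, fit_loc, fit_beta)
    # example output (varies from run to run):
    # 5.0833692504230008 100.08697963283467 21.739518937816108
    
    print(alpha, loc, beta)
    # 5 100.5 22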
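To check a fit like the one above, one option is to freeze a distribution with the fitted parameters and compare it to the data. The sketch below does that and also passes optional starting guesses to the fit. The seed, the guess values (4.0, 90.0, 20.0), and the use of a Kolmogorov-Smirnov distance as the check are assumptions added for illustration, not part of the answer.

```python
import numpy as np
from scipy import stats

# Regenerate data like the example above, with a fixed seed (an assumption
# added here so that the run is repeatable).
alpha, loc, beta = 5, 100.5, 22
rng = np.random.default_rng(42)
data = stats.gamma.rvs(alpha, loc=loc, scale=beta, size=10000, random_state=rng)

# The positional value and the loc/scale keywords are only starting guesses
# for the optimizer; the result is still returned as (shape, loc, scale).
fit_alpha, fit_loc, fit_beta = stats.gamma.fit(data, 4.0, loc=90.0, scale=20.0)

# Freeze a distribution with the fitted parameters and compare it to the data.
fitted = stats.gamma(fit_alpha, loc=fit_loc, scale=fit_beta)
print("sample mean:", data.mean(), "fitted mean:", fitted.mean())

# Kolmogorov-Smirnov distance between the data and the fitted CDF.
ks = stats.kstest(data, fitted.cdf)
print("KS statistic:", ks.statistic)
```

Because the parameters were estimated from the same data, the KS p-value is optimistic, so the statistic is better read as a rough distance than as a formal test.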
    
  • 2020-12-09 03:05

    I was unsatisfied with the ss.gamma.rvs function because it can generate negative numbers, which a gamma distribution is not supposed to produce. So I fitted the sample by matching the expected value to mean(data) and the variance to var(data) (see Wikipedia for details), and wrote a function that yields random samples from a gamma distribution without scipy (which, as a side note, I found hard to install properly):

    import random
    import numpy
    
    data = [6176, 11046, 670, 6146, 7945, 6864, 767, 7623, 7212, 9040, 3213, 6302, 10044, 10195, 9386, 7230, 4602, 6282, 8619, 7903, 6318, 13294, 6990, 5515, 9157]
    
    # Fit gamma distribution through mean and variance
    mean_of_distribution = numpy.mean(data)
    variance_of_distribution = numpy.var(data)
    
    def gamma_random_sample(mean, variance, size):
        """Yield random numbers following a gamma distribution defined by mean and variance."""
        g_alpha = mean*mean/variance
        g_beta = mean/variance
        for i in range(size):
            yield random.gammavariate(g_alpha,1/g_beta)
    
    # force integer values to get integer sample
    grs = [int(i) for i in gamma_random_sample(mean_of_distribution,variance_of_distribution,len(data))]
    
    print("Original data: ", sorted(data))
    print("Random sample: ", sorted(grs))
    
    # Original data: [670, 767, 3213, 4602, 5515, 6146, 6176, 6282, 6302, 6318, 6864, 6990, 7212, 7230, 7623, 7903, 7945, 8619, 9040, 9157, 9386, 10044, 10195, 11046, 13294]
    # Random sample:  [1646, 2237, 3178, 3227, 3649, 4049, 4171, 5071, 5118, 5139, 5456, 6139, 6468, 6726, 6944, 7050, 7135, 7588, 7597, 7971, 10269, 10563, 12283, 12339, 13066]
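The moment matching above follows from the gamma mean alpha/beta and variance alpha/beta^2, with beta as the rate, which gives alpha = mean^2/variance and beta = mean/variance. A quick sanity check, sketched here with numpy's random generator and an arbitrary seed and sample size (assumptions, not part of the answer), is to draw a large sample and confirm that its mean and variance land close to the targets:

```python
import numpy as np

data = [6176, 11046, 670, 6146, 7945, 6864, 767, 7623, 7212, 9040, 3213,
        6302, 10044, 10195, 9386, 7230, 4602, 6282, 8619, 7903, 6318,
        13294, 6990, 5515, 9157]

target_mean = np.mean(data)
target_var = np.var(data)

# For a gamma distribution: mean = shape * scale, variance = shape * scale**2.
shape = target_mean**2 / target_var
scale = target_var / target_mean  # scale is 1 / rate

rng = np.random.default_rng(1)
sample = rng.gamma(shape, scale, size=200_000)

print("target mean/var:", target_mean, target_var)
print("sample mean/var:", sample.mean(), sample.var())
```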
    
  • 2020-12-09 03:08

    If you want a long example including a discussion about estimating or fixing the support of the distribution, then you can find it in https://github.com/scipy/scipy/issues/1359 and the linked mailing list message.

    Preliminary support to fix parameters, such as location, during fit has been added to the trunk version of scipy.
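In current scipy releases, fixing a parameter during the fit is done with keywords such as floc and fscale. A minimal sketch, assuming synthetic data with a true location of zero and an arbitrary seed (both chosen only for this example):

```python
import numpy as np
from scipy import stats

# Synthetic two-parameter gamma data (assumed shape 2.0 and scale 3.0).
rng = np.random.default_rng(7)
data = stats.gamma.rvs(2.0, loc=0.0, scale=3.0, size=5000, random_state=rng)

# floc=0 holds the location at zero, so only shape and scale are estimated.
fit_alpha, fit_loc, fit_scale = stats.gamma.fit(data, floc=0)
print(fit_alpha, fit_loc, fit_scale)
```

With the location held at zero, the fitted distribution has support on positive values only, so samples drawn from it cannot be negative.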
