Fitting empirical distribution to theoretical ones with Scipy (Python)?

醉话见心 2020-11-22 05:28

INTRODUCTION: I have a list of more than 30,000 integer values ranging from 0 to 47, inclusive, e.g. [0,0,0,0,...,1,1,1,1,...,2,2,2,2,...,47,47,47,...]

9 answers
  •  太阳男子
    2020-11-22 05:45

    As far as I understand, your distribution is discrete (and nothing but discrete). Counting how often each value occurs and normalizing the counts should therefore be enough for your purposes. Here is an example to demonstrate this:

    In []: import numpy as np
    In []: values = [0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4]
    In []: counts = np.bincount(values).astype(float)  # occurrences of each value 0..max
    In []: cdf = counts.cumsum() / counts.sum()        # empirical CDF: P(X <= k)
    

    Thus, the probability of seeing values higher than 1 is simply given by the complementary cumulative distribution function (ccdf):

    In []: 1 - cdf[1]
    Out[]: 0.40000000000000002
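
The same counting idea gives the ccdf at every threshold at once. Below is a minimal sketch, assuming non-negative integer data as in the question; the helper name `empirical_ccdf` is made up for illustration and is not part of NumPy or SciPy.

```python
import numpy as np

def empirical_ccdf(values):
    """Return P(X > k) for k = 0..max(values), estimated from integer samples."""
    counts = np.bincount(values).astype(float)   # frequency of each value 0..max
    cdf = counts.cumsum() / counts.sum()         # empirical P(X <= k)
    return 1.0 - cdf                             # complementary CDF, P(X > k)

values = [0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4]
ccdf = empirical_ccdf(values)
print(ccdf[1])   # probability of seeing a value greater than 1
```

Indexing the result with `k` gives the estimated probability of exceeding `k`, so the last entry is always zero.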
    

    Please note that the ccdf is closely related to the survival function (sf): both are 1 - cdf. The name sf is more common for continuous distributions, but the quantity is equally well defined for discrete ones.
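
In SciPy, discrete distribution objects also provide an `sf` method, so the empirical frequencies can be wrapped in `scipy.stats.rv_discrete` and queried directly. The following is only a sketch built on the toy data above; the variable names (`support`, `pmf`, `empirical`) are illustrative choices.

```python
import numpy as np
from scipy import stats

values = [0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4]
counts = np.bincount(values)
support = np.arange(len(counts))      # the possible values 0..max
pmf = counts / counts.sum()           # normalized frequencies

# Build a discrete distribution object from the empirical pmf.
empirical = stats.rv_discrete(name="empirical", values=(support, pmf))

print(empirical.sf(1))        # P(X > 1) via the survival function
print(1 - empirical.cdf(1))   # the same quantity computed as 1 - cdf
```

Both printed values should agree with `1 - cdf[1]` from the hand-rolled version, up to floating-point rounding.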
