Estimate confidence intervals for parameters of a distribution in Python

Submitted by ぐ巨炮叔叔 on 2021-01-21 10:30:20

Question


Is there a built-in function in a Python package that provides confidence intervals for parameter estimates, or is this something I will need to implement by hand? I am looking for something similar to MATLAB's gevfit: http://www.mathworks.com/help/stats/gevfit.html.


Answer 1:


Take a look at scipy and numpy in case you haven't already. If you have some familiarity with MATLAB, then the switch should be relatively easy. I've taken this quick snippet from this SO response:

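import numpy as np
from scipy import stats

def mean_confidence_interval(data, confidence=0.95):
    """Return the sample mean and a two-sided t-based confidence interval for it."""
    a = np.asarray(data, dtype=float)
    n = len(a)
    m, se = np.mean(a), stats.sem(a)  # sample mean and its standard error
    # Half-width from the Student t quantile with n-1 degrees of freedom
    h = se * stats.t.ppf((1 + confidence) / 2., n - 1)
    return m, m - h, m + h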

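You should be able to customize the return values to your liking. Like the MATLAB gevfit function, it defaults to 95% confidence bounds. Note that this function gives an interval for the sample mean only, not for the fitted parameters of a distribution.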




Answer 2:


The bootstrap can be used to estimate confidence intervals of any function (np.mean, st.genextreme.fit, etc.) of a sample, and there is a Python library: scikits.bootstrap.

Here it is applied to the data from the asker's related question:

import numpy as np, scipy.stats as st, scikits.bootstrap as boot
data = np.array([ 22.20379411,  22.99151292,  24.27032696,  24.82180626,
  25.23163221,  25.39987272,  25.54514567,  28.56710007,
  29.7575898 ,  30.15641696,  30.79168255,  30.88147532,
  31.0236419 ,  31.17380647,  31.61932755,  32.23452568,
  32.76262978,  33.39430032,  33.81080069,  33.90625861,
  33.99142006,  35.45748368,  37.0342621 ,  37.14768791,
  38.14350221,  42.72699534,  44.16449992,  48.77736737,
  49.80441736,  50.57488779])

st.genextreme.fit(data)   # just to check the parameters
boot.ci(data, st.genextreme.fit)

The results are

(-0.014387281261850815, 29.762126238637851, 5.8983127779873605)
array([[ -0.40002507,  26.93511496,   4.6677834 ],
       [  0.19743722,  32.41834882,   9.05026202]])

The bootstrap takes about three minutes on my machine. By default, boot.ci uses 10,000 bootstrap iterations (n_samples; see the code or help(boot.ci)), and st.genextreme.fit is not particularly fast.
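If installing scikits.bootstrap is not an option, the idea can be sketched by hand with NumPy and SciPy alone. The helper below, percentile_bootstrap_ci, is a hypothetical name used for illustration and is not part of either library. It uses the plain percentile method, so its numbers need not match those from boot.ci. The synthetic sample and the small n_resamples=100 are also assumptions, chosen only to keep the run short.

```python
import numpy as np
import scipy.stats as st


def percentile_bootstrap_ci(data, statistic, n_resamples=1000,
                            confidence=0.95, seed=None):
    """Percentile bootstrap interval for a scalar- or tuple-valued statistic.

    Hypothetical helper for illustration; not part of SciPy or scikits.bootstrap.
    Returns an array of shape (2, k): row 0 is the lower bound, row 1 the upper.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = len(data)
    # Re-evaluate the statistic on resamples drawn with replacement.
    estimates = np.array([
        np.atleast_1d(statistic(rng.choice(data, size=n, replace=True)))
        for _ in range(n_resamples)
    ])
    alpha = 1.0 - confidence
    lower = np.percentile(estimates, 100 * alpha / 2, axis=0)
    upper = np.percentile(estimates, 100 * (1 - alpha / 2), axis=0)
    return np.vstack([lower, upper])


# Synthetic GEV sample (assumed values, not the data from the question).
sample = st.genextreme.rvs(c=-0.1, loc=30.0, scale=6.0, size=30,
                           random_state=np.random.default_rng(0))

# One column per fitted parameter: shape c, loc, scale.
ci = percentile_bootstrap_ci(sample, st.genextreme.fit, n_resamples=100, seed=1)
print(ci)
```

For real use, n_resamples should be much larger (boot.ci's default of 10,000 is a reasonable reference point), at the cost of a correspondingly longer run.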

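The confidence intervals from boot.ci do not match the ones from MATLAB's gevfit exactly. For example, MATLAB gives a symmetric interval [-0.3032, 0.3320] for the first parameter (0.0144). Note that MATLAB's shape parameter uses the opposite sign convention to scipy's c, which is why the point estimates are 0.0144 and -0.0144; in MATLAB's convention the bootstrap interval for the shape parameter is roughly [-0.197, 0.400].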
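Symmetric intervals of this kind are what a normal (Wald) approximation around the maximum-likelihood estimate produces. Whether gevfit computes its bounds exactly this way is an assumption here, not something confirmed from its source. As a rough sketch, the standard errors can be taken from the inverse of a numerically differentiated Hessian of the negative log-likelihood. The helper names gev_nll, numerical_hessian and wald_ci are hypothetical, and the sample is synthetic.

```python
import numpy as np
import scipy.stats as st


def gev_nll(params, data):
    """Negative log-likelihood of scipy's genextreme for params = (c, loc, scale)."""
    c, loc, scale = params
    return -np.sum(st.genextreme.logpdf(data, c, loc=loc, scale=scale))


def numerical_hessian(f, x, rel_step=1e-4):
    """Central finite-difference Hessian of a scalar function f at x."""
    x = np.asarray(x, dtype=float)
    k = x.size
    steps = rel_step * np.maximum(np.abs(x), 1.0)
    hess = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k)
            ej = np.zeros(k)
            ei[i] = steps[i]
            ej[j] = steps[j]
            hess[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * steps[i] * steps[j])
    return hess


def wald_ci(data, confidence=0.95):
    """Fit genextreme by MLE and return (estimates, 2 x 3 array of Wald bounds)."""
    data = np.asarray(data, dtype=float)
    mle = np.array(st.genextreme.fit(data))
    # Observed information: Hessian of the negative log-likelihood at the MLE.
    hess = numerical_hessian(lambda p: gev_nll(p, data), mle)
    se = np.sqrt(np.diag(np.linalg.inv(hess)))
    z = st.norm.ppf((1.0 + confidence) / 2.0)
    return mle, np.vstack([mle - z * se, mle + z * se])


# Synthetic GEV sample (assumed values, not the data from the question).
sample = st.genextreme.rvs(c=-0.1, loc=30.0, scale=6.0, size=200,
                           random_state=np.random.default_rng(0))
mle, ci = wald_ci(sample)
print(mle)
print(ci)
```

This approximation is only trustworthy when the sample is large enough for the likelihood to be close to quadratic near its maximum. With around 30 observations, as in the data above, the bootstrap interval and the Wald interval can differ noticeably.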



Source: https://stackoverflow.com/questions/31481279/estimate-confidence-intervals-for-parameters-of-distribution-in-python
