Question
Is there a built-in function in a Python package that provides confidence intervals for parameter estimates, or is this something I will need to implement by hand? I am looking for something similar to MATLAB's gevfit: http://www.mathworks.com/help/stats/gevfit.html.
Answer 1:
Take a look at scipy and numpy in case you haven't already. If you have some familiarity with MATLAB, then the switch should be relatively easy. I've taken this quick snippet from this SO response:
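import numpy as np
import scipy.stats

def mean_confidence_interval(data, confidence=0.95):
    # Two-sided Student's t confidence interval for the sample mean.
    a = 1.0 * np.array(data)
    n = len(a)
    m, se = np.mean(a), scipy.stats.sem(a)
    # Half-width: standard error times the t critical value (n-1 degrees of freedom).
    h = se * scipy.stats.t.ppf((1 + confidence) / 2., n - 1)
    return m, m - h, m + h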
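You should be able to customize the return values to your liking. The function returns the sample mean together with the lower and upper bounds of its confidence interval, so it covers the mean rather than fitted distribution parameters. Like the MATLAB gevfit function, it defaults to 95% confidence bounds.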
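As a quick cross-check of the snippet above, the same interval can be obtained from scipy.stats.t.interval. This is only a sketch: the sample values are arbitrary illustrative numbers, not data from the question.

```python
import numpy as np
import scipy.stats as st

# Arbitrary illustrative sample (an assumption, not the question's data).
sample = np.array([22.2, 24.8, 25.4, 28.6, 30.2, 31.0, 32.2, 33.8, 35.5, 38.1])
confidence = 0.95

n = len(sample)
m = np.mean(sample)
se = st.sem(sample)  # standard error of the mean (ddof=1)

# Manual half-width, as in mean_confidence_interval above.
h = se * st.t.ppf((1 + confidence) / 2.0, n - 1)

# Equivalent one-call form: t distribution centred on m and scaled by se.
low, high = st.t.interval(confidence, n - 1, loc=m, scale=se)

print(m - h, m + h)
print(low, high)
```

Both print statements should show the same pair of bounds, up to floating-point rounding.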
Answer 2:
The bootstrap can be used to estimate confidence intervals of any function (np.mean, st.genextreme.fit, etc.) of a sample, and there is a Python library for it: scikits.bootstrap.
Here it is applied to the data from the question author's related question:
import numpy as np, scipy.stats as st, scikits.bootstrap as boot
data = np.array([ 22.20379411, 22.99151292, 24.27032696, 24.82180626,
25.23163221, 25.39987272, 25.54514567, 28.56710007,
29.7575898 , 30.15641696, 30.79168255, 30.88147532,
31.0236419 , 31.17380647, 31.61932755, 32.23452568,
32.76262978, 33.39430032, 33.81080069, 33.90625861,
33.99142006, 35.45748368, 37.0342621 , 37.14768791,
38.14350221, 42.72699534, 44.16449992, 48.77736737,
49.80441736, 50.57488779])
st.genextreme.fit(data) # just to check the parameters
boot.ci(data, st.genextreme.fit)
The results are
(-0.014387281261850815, 29.762126238637851, 5.8983127779873605)
array([[ -0.40002507, 26.93511496, 4.6677834 ],
[ 0.19743722, 32.41834882, 9.05026202]])
The bootstrap takes about three minutes on my machine; by default, boot.ci uses 10,000 bootstrap iterations (n_samples; see the code or help(boot.ci)), and st.genextreme.fit is not especially fast.
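To illustrate what a bootstrap interval involves, and how the number of resamples drives the runtime, here is a rough percentile-bootstrap sketch that uses only NumPy and SciPy. The helper names percentile_bootstrap_ci and fit_gev, the choice of 200 resamples, and the fixed seed are assumptions made for illustration. This is not the scikits.bootstrap implementation, whose interval method may differ, so the numbers will not match the output above.

```python
import numpy as np
import scipy.stats as st

data = np.array([22.20379411, 22.99151292, 24.27032696, 24.82180626,
                 25.23163221, 25.39987272, 25.54514567, 28.56710007,
                 29.7575898, 30.15641696, 30.79168255, 30.88147532,
                 31.0236419, 31.17380647, 31.61932755, 32.23452568,
                 32.76262978, 33.39430032, 33.81080069, 33.90625861,
                 33.99142006, 35.45748368, 37.0342621, 37.14768791,
                 38.14350221, 42.72699534, 44.16449992, 48.77736737,
                 49.80441736, 50.57488779])


def percentile_bootstrap_ci(data, statfunc, n_resamples=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for each value returned by statfunc.

    Hypothetical helper: resamples the data with replacement, re-evaluates
    statfunc on each resample, and takes the alpha/2 and 1-alpha/2
    percentiles. Returns an array whose rows are (lower, upper).
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    estimates = np.array([
        statfunc(rng.choice(data, size=n, replace=True))
        for _ in range(n_resamples)
    ])
    lower = np.percentile(estimates, 100 * alpha / 2, axis=0)
    upper = np.percentile(estimates, 100 * (1 - alpha / 2), axis=0)
    return np.vstack([lower, upper])


# Full-sample fit, reused as the starting guess for every resample.
c0, loc0, scale0 = st.genextreme.fit(data)


def fit_gev(sample):
    # Start the optimizer at the full-sample estimate (an illustrative choice).
    return st.genextreme.fit(sample, c0, loc=loc0, scale=scale0)


ci = percentile_bootstrap_ci(data, fit_gev, n_resamples=200)
print(ci)  # rows: lower, upper; columns: shape c, loc, scale
```

The runtime grows in proportion to n_resamples, since each resample requires one full fit. A small value such as 200 keeps the run short but gives noisier interval endpoints than the 10,000 iterations used by default in boot.ci.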
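The confidence intervals from boot.ci do not match the ones from MATLAB's gevfit exactly. For example, MATLAB gives a symmetric interval, [-0.3032, 0.3320], for the first parameter (0.0144). Note that SciPy's genextreme uses the opposite sign convention for the shape parameter, which is why the estimate appears as -0.0144 above.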
Source: https://stackoverflow.com/questions/31481279/estimate-confidence-intervals-for-parameters-of-distribution-in-python