Question
How do I get leverage/get_influence from a WLS model fit in Python statsmodels?
Taking the example from http://statsmodels.sourceforge.net/stable/index.html:
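import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
# Load data
dat = sm.datasets.get_rdataset("Guerry", "HistData").data
# Fit regression model (using the natural log of one of the regressors)
results_ols = smf.ols('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit()
# No weights are passed here, so WLS falls back to its default weights of 1
results_w = smf.wls('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit()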
I can call
results_ols.get_influence()
but not results_w.get_influence().
Is there an equivalent for WLS?
I would be interested in any solutions outside of statsmodels as well.
Answer 1:
You can get the influence and outlier measures for the weighted variables by using OLS on the weighted variables.
For example, if mod_wls is your WLS model (the model instance, not the results instance), then:
res = sm.OLS(mod_wls.wendog, mod_wls.wexog).fit()
infl = res.get_influence()
AFAIK, most or all influence measures will be correct, but they are in terms of the weighted variables and observations. Some of the influence measures also have definitions in terms of the original variables, but those will not be available this way. For example, there are two ways to define the hat matrix for WLS: one corresponds to using the weighted variables as above, and the other expresses the influence in terms of the original variables.
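As a rough NumPy illustration of that distinction, the sketch below builds one common pair of WLS hat matrices. Whether this is exactly the pair meant above is an assumption, and the data and the names H_weighted and H_original are hypothetical.

```python
import numpy as np

# Synthetic design matrix and weights (assumptions for illustration)
rng = np.random.default_rng(0)
n = 20
X = np.column_stack([np.ones(n), rng.uniform(1.0, 10.0, size=n)])
w = rng.uniform(0.5, 2.0, size=n)

# (X' W X)^{-1}, shared by both definitions
XtWX_inv = np.linalg.inv(X.T @ (w[:, None] * X))

# Definition 1: hat matrix of the weighted variables, i.e. what
# OLS on sqrt(w) * X would use. Symmetric projection matrix.
Xw = np.sqrt(w)[:, None] * X
H_weighted = Xw @ XtWX_inv @ Xw.T

# Definition 2: maps the original y to the original fitted values,
# yhat = H_original @ y. Idempotent but generally not symmetric.
H_original = X @ XtWX_inv @ (X.T * w)

lev_weighted = np.diag(H_weighted)
lev_original = np.diag(H_original)
```

In this particular pair the diagonals coincide and only the off-diagonal entries differ, so the leverage values agree; other definitions in terms of the original variables may not share that property.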
(A similar issue shows up in GLM and RLM, which are both based on iteratively reweighted least squares; see e.g. https://github.com/statsmodels/statsmodels/issues/808
The influence and outlier statistics have not been extended to other models, mostly because of a lack of references in the statistical literature that explicitly handle this case, and because no reference implementation in another package is known that could be used for the unit tests.
Update:
GLM now has some outlier and influence measures:
https://www.statsmodels.org/dev/generated/statsmodels.genmod.generalized_linear_model.GLMResults.get_influence.html
but there is still nothing explicitly for WLS.)
Source: https://stackoverflow.com/questions/40621686/does-statsmodels-wls-have-get-influence-function