How to get the same values for AIC and BIC in R as in Stata?

Posted by 别说谁变了你拦得住时间么 on 2021-02-05 07:34:24

Question


Say I have a very simple model

library(foreign)

smoke <- read.dta("http://fmwww.bc.edu/ec-p/data/wooldridge/smoke.dta")

smoking.reg <- lm(cigs ~ educ, data=smoke)

AIC(smoking.reg)
BIC(smoking.reg)

In R I get the following results:

> AIC(smoking.reg)
[1] 6520.26
> BIC(smoking.reg)
[1] 6534.34

However, running the same regression in Stata

 use http://fmwww.bc.edu/ec-p/data/wooldridge/smoke.dta
 reg cigs educ
 estat ic

returns different values for AIC and BIC.

How can I get R to return exactly the same AIC and BIC values as Stata?


Answer 1:


AIC is calculated as -2 * log-likelihood + 2 * (number of parameters).
BIC is calculated as -2 * log-likelihood + log(n) * (number of parameters), where n is the sample size.

Your linear regression has three parameters (two coefficients and the variance), so you can calculate AIC and BIC as

ll = logLik(smoking.reg)
aic = -2*ll + 2* 3 # 6520.26
bic = -2*ll + log(nrow(smoke))* 3 # 6534.34

(As Ben Bolker mentioned in the comments, the logLik object has attributes that give the number of parameters ("df") and the number of observations ("nobs"); see attr(ll, "df") and attr(ll, "nobs").)
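The attribute-based approach can be sketched as follows. This is a minimal illustration, assuming the built-in mtcars data as a stand-in for the smoke data so that it runs without a download; the model and variable names are illustrative only.

```r
# Stand-in model on a built-in dataset (assumption: mtcars replaces smoke.dta)
fit <- lm(mpg ~ wt, data = mtcars)

ll <- logLik(fit)
k  <- attr(ll, "df")    # number of estimated parameters, including the variance
n  <- attr(ll, "nobs")  # number of observations used in the fit

aic <- -2 * as.numeric(ll) + 2 * k
bic <- -2 * as.numeric(ll) + log(n) * k

# Both should agree with R's built-in functions
all.equal(aic, AIC(fit))
all.equal(bic, BIC(fit))
```

Reading the counts from the logLik object avoids hard-coding 3 and nrow(smoke), which matters if rows with missing values are dropped during fitting.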

Stata does not count the variance parameter; it counts only the coefficients. This is usually not a problem, because information criteria are typically used to compare models (AIC_of_model1 - AIC_of_model2), so omitting this parameter from both calculations makes no difference to the comparison. Stata's calculation is equivalent to

aic = -2*ll + 2* 2 # 6518.26
bic = -2*ll + log(nrow(smoke))* 2 # 6527.647
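To avoid hard-coding the parameter count, the Stata-style values could be wrapped in a small helper. The function name stata_ic below is hypothetical (it is not part of any package), and mtcars again stands in for the smoke data. The sketch assumes, as described above, that Stata leaves the variance out of the parameter count.

```r
# Hypothetical helper: information criteria that exclude the residual
# variance from the parameter count (the convention described for Stata).
stata_ic <- function(model) {
  ll <- logLik(model)
  k  <- attr(ll, "df") - 1   # R's parameter count minus the variance
  n  <- attr(ll, "nobs")
  c(AIC = -2 * as.numeric(ll) + 2 * k,
    BIC = -2 * as.numeric(ll) + log(n) * k)
}

# Stand-in models (assumption: mtcars replaces smoke.dta)
fit1 <- lm(mpg ~ wt, data = mtcars)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)

stata_ic(fit1)

# The AIC difference between two models is the same under both conventions,
# because the extra variance parameter cancels out.
all.equal(unname(stata_ic(fit1)["AIC"] - stata_ic(fit2)["AIC"]),
          AIC(fit1) - AIC(fit2))
```

Applied to the model from the question, stata_ic(smoking.reg) should reproduce the two values computed above.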


Source: https://stackoverflow.com/questions/62307197/how-to-get-the-same-values-for-aic-and-bic-in-r-as-in-stata
