How to get the same values for AIC and BIC in R as in Stata?

Question

Say I have a very simple model:

library(foreign)

smoke <- read.dta("http://fmwww.bc.edu/ec-p/data/wooldridge/smoke.dta")

smoking.reg <- lm(cigs ~ educ, data=smoke)

AIC(smoking.reg)
BIC(smoking.reg)

In R I get the following results:

> AIC(smoking.reg)
[1] 6520.26
> BIC(smoking.reg)
[1] 6534.34

However, running the same regression in Stata

 use http://fmwww.bc.edu/ec-p/data/wooldridge/smoke.dta
 reg cigs educ

followed by

estat ic

returns different values for AIC and BIC.

How can I get R to return exactly the same values as does Stata for AIC and BIC?

Answer 1:

AIC is calculated as -2 * log-likelihood + 2 * (number of parameters).
BIC is calculated as -2 * log-likelihood + log(n) * (number of parameters), where n is the sample size.

Your linear regression has three parameters (two coefficients and the error variance), so you can calculate AIC and BIC as

ll = logLik(smoking.reg)
aic = -2*ll + 2* 3 # 6520.26
bic = -2*ll + log(nrow(smoke))* 3 # 6534.34

(As Ben Bolker mentioned in the comments, the logLik object has several attributes that you can use to get the number of parameters ("df") and the number of observations ("nobs"). See attr(ll, "df") and attr(ll, "nobs").)
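
The attribute-based approach can be sketched as follows. This is only an illustration: it uses the built-in mtcars data as a stand-in so that it runs without downloading smoke.dta, and the model (mpg ~ wt) and the names fit, k, n, aic_manual and bic_manual are assumptions made for the example, not part of the original question.

```r
# Stand-in model on a built-in dataset (NOT the smoke data from the question)
fit <- lm(mpg ~ wt, data = mtcars)
ll  <- logLik(fit)

k <- attr(ll, "df")    # parameters counted by R: coefficients + error variance
n <- attr(ll, "nobs")  # observations used in the fit

# Same formulas as above, with k and n taken from the logLik object
aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + log(n) * k

# Both comparisons should return TRUE
all.equal(aic_manual, AIC(fit))
all.equal(bic_manual, BIC(fit))
```

Reading k and n from the logLik object avoids hard-coding the parameter count, and it uses the number of observations actually fitted rather than nrow() of the data frame, which would differ if rows were dropped for missing values.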

Stata does not count the variance parameter; it counts only the coefficients. This is usually not a problem, because information criteria are typically used to compare models (AIC_of_model1 - AIC_of_model2), so omitting this parameter from both calculations makes no difference to the comparison. In Stata the calculation is

aic = -2*ll + 2* 2 # 6518.26
bic = -2*ll + log(nrow(smoke))* 2 # 6527.647
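
If you need this more than once, the coefficient-only convention described above could be wrapped in a small helper. The function name stata_ic is hypothetical (it is not part of any package), and the example again uses mtcars as a stand-in dataset; treat it as a sketch of the calculation in this answer rather than a guaranteed reproduction of Stata's output for every model type.

```r
# Hypothetical helper: information criteria that count only the coefficients,
# following the convention this answer attributes to Stata.
stata_ic <- function(fit) {
  ll <- as.numeric(logLik(fit))
  k  <- sum(!is.na(coef(fit)))  # estimated coefficients only, no variance term
  n  <- nobs(fit)               # observations used in the fit
  c(AIC = -2 * ll + 2 * k,
    BIC = -2 * ll + log(n) * k)
}

# Stand-in model on a built-in dataset (NOT the smoke data from the question)
fit <- lm(mpg ~ wt, data = mtcars)
ic  <- stata_ic(fit)

# R counts one extra parameter, so the gaps are 2 for AIC and log(n) for BIC
AIC(fit) - ic["AIC"]
BIC(fit) - ic["BIC"]
```

For the model in the question the call would be stata_ic(smoking.reg).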

Source: https://stackoverflow.com/questions/62307197/how-to-get-the-same-values-for-aic-and-bic-in-r-as-in-stata

Tags

stata