Using stargazer with memory greedy glm objects

坚强是说给别人听的谎言 submitted on 2020-01-06 03:07:05

Question


I'm trying to run the following regression:

m1=glm(y~x1+x2+x3+x4,data=df,family=binomial())
m2=glm(y~x1+x2+x3+x4+x5,data=df,family=binomial())
m3=glm(y~x1+x2+x3+x4+x5+x6,data=df,family=binomial())
m4=glm(y~x1+x2+x3+x4+x5+x6+x7,data=df,family=binomial())

and then to print them using the stargazer package:

stargazer(m1,m2,m3,m4, type="html", out="models.html")

Thing is, the data frame df is rather big (~600MB) and thus each glm object I create is at least ~1.5GB. This creates a memory issue which prevents me from creating all the regressions I need to print in stargazer.

I've tried 2 approaches in order to decrease the size of the glm objects:

  1. Trim the glm object using this tutorial. This indeed trims the glm object to <1MB, though I get the following error from the stargazer function:
Error in Qr$qr[p1, p1, drop = FALSE] : incorrect number of dimensions
  2. Use the package speedglm. However, it's not supported by stargazer.

Any suggestions?


Answer 1:


stargazer calls summary() on each model, and summary() requires the model's qr component (see source code). So, as far as I know, it is not possible to pass a glm object that has been trimmed this way.
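To illustrate the dependency on qr, here is a minimal sketch using a small simulated data set (the data frame d and its columns are made up for illustration and are not the asker's data). It removes the qr component from a fitted glm and checks whether summary() still runs:

```r
# Simulated stand-in for the asker's data (hypothetical, for illustration only)
set.seed(1)
d <- data.frame(x1 = rnorm(200))
d$y <- rbinom(200, 1, plogis(0.5 * d$x1))

fit <- glm(y ~ x1, data = d, family = binomial())

# Mimic aggressive trimming by dropping the QR decomposition
trimmed <- fit
trimmed$qr <- NULL

# summary() needs the QR decomposition to build the covariance matrix,
# so the call on the trimmed object is expected to fail
res <- try(summary(trimmed), silent = TRUE)
summary_failed <- inherits(res, "try-error")
print(summary_failed)
```

The exact error message may differ from the one in the question, depending on the R version and on how the object was trimmed.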

BUT I think that it should be easy to rewrite stargazer to handle a list of summaries as an input. It would be extremely handy.




Answer 2:


An option that has worked well for me is to first convert the large *lm objects to "coeftest" class using the lmtest package. A "coeftest" object is really just a matrix of your summarised regression results and hardly takes up any space as a result. Moreover, Stargazer readily accepts the "coeftest" class as an input, so your code doesn't need to change much at all.

Using your example:

library(lmtest)

m1 <- glm(y~x1+x2+x3+x4,data=df,family=binomial())
m1 <- coeftest(m1)
m2 <- glm(y~x1+x2+x3+x4+x5,data=df,family=binomial())
m2 <- coeftest(m2)
m3 <- glm(y~x1+x2+x3+x4+x5+x6,data=df,family=binomial())
m3 <- coeftest(m3)
m4 <- glm(y~x1+x2+x3+x4+x5+x6+x7,data=df,family=binomial())
m4 <- coeftest(m4)

stargazer(m1,m2,m3,m4, type="html", out="models.html")
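To get a feel for the size difference, the following sketch compares a fitted glm with its "coeftest" counterpart on simulated data (the data frame d is hypothetical and much smaller than the one in the question). Note that object.size() does not count environments attached to the model, so it may understate the real footprint of the glm object:

```r
library(lmtest)

# Simulated stand-in for the asker's data (hypothetical)
set.seed(1)
d <- data.frame(x1 = rnorm(5000), x2 = rnorm(5000))
d$y <- rbinom(5000, 1, plogis(0.5 * d$x1 - 0.25 * d$x2))

fit <- glm(y ~ x1 + x2, data = d, family = binomial())
ct  <- coeftest(fit)   # small matrix of estimates, SEs, z values, p-values

size_fit <- as.numeric(object.size(fit))
size_ct  <- as.numeric(object.size(ct))
cat("glm object:     ", size_fit, "bytes\n")
cat("coeftest object:", size_ct, "bytes\n")

# Once the summary matrix exists, the large object can be released
rm(fit)
invisible(gc())
```

Overwriting m1 with coeftest(m1), as in the code above, has the same effect as the explicit rm(): the large glm object becomes eligible for garbage collection.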

Apart from taking care of the memory problem, this approach has the added benefit of the coeftest() transformation itself being extremely quick. (Well, with the notable exception of times when you ask it to produce robust/clustered standard errors on a particularly large *lm object by invoking the "vcov = vcovHC" option. However, even then, the coeftest() transformation is a necessary step to exporting the robust regression results in the first place.)
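For the robust standard error case mentioned above, a minimal sketch could look like the following. It assumes the sandwich package is installed (vcovHC comes from there), uses simulated data rather than the asker's, and picks type = "HC0" purely as an example:

```r
library(lmtest)
library(sandwich)   # provides vcovHC(); assumed to be installed

# Simulated stand-in for the asker's data (hypothetical)
set.seed(1)
d <- data.frame(x1 = rnorm(500))
d$y <- rbinom(500, 1, plogis(0.5 * d$x1))

fit <- glm(y ~ x1, data = d, family = binomial())

# Default (model-based) standard errors
ct_plain <- coeftest(fit)

# Heteroskedasticity-consistent standard errors; this is the slow step
# on large models because the covariance matrix is recomputed
ct_robust <- coeftest(fit, vcov. = vcovHC(fit, type = "HC0"))

print(ct_robust)
```

Both ct_plain and ct_robust are small "coeftest" matrices, so either can be passed to stargazer in place of the original model.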

A minor downside to this approach is that it doesn't save some regression statistics that may be of interest for your Stargazer table (e.g. R-squared or N). However, you could easily obtain these from the *lm object before converting it.
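One way to keep those statistics, sketched below under some assumptions, is to record them while the full model still exists and then hand them to stargazer through its add.lines argument. The helper fit_small() and the simulated data frame d are hypothetical names introduced for this example, and AIC is used as the fit statistic simply because it is readily available for a glm:

```r
library(lmtest)
library(stargazer)

# Hypothetical helper: fit the model, keep only the small pieces.
# The full glm object goes out of scope when the function returns.
fit_small <- function(formula, data) {
  fit <- glm(formula, data = data, family = binomial())
  list(coefs = coeftest(fit), n = nobs(fit), aic = AIC(fit))
}

# Simulated stand-in for the asker's data (hypothetical)
set.seed(1)
d <- data.frame(x1 = rnorm(500), x2 = rnorm(500))
d$y <- rbinom(500, 1, plogis(0.5 * d$x1 - 0.25 * d$x2))

r1 <- fit_small(y ~ x1, d)
r2 <- fit_small(y ~ x1 + x2, d)

# Extra table rows: a label followed by one value per model
extra <- list(
  c("Observations", r1$n, r2$n),
  c("AIC", round(r1$aic, 1), round(r2$aic, 1))
)

stargazer(r1$coefs, r2$coefs, type = "html", out = "models.html",
          add.lines = extra)
```

Wrapping the fit in a function means only one large glm object is alive at a time, which is the point of the approach.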



Source: https://stackoverflow.com/questions/26010742/using-stargazer-with-memory-greedy-glm-objects
