Fit model on a subset of columns in dataframe in R

天涯浪子 submitted on 2019-12-11 20:31:53

Question


I'm trying to use lm() and matchit() on a subset of covariates. I have generated an arbitrary number of columns with prefix "covar", i.e. "covar.1", "covar.2", etc. I'd like to do something like

lm(group ~ covars, data=df)

where covars is a vector of strings c("covar.1", "covar.2", ...).

I tried several things like

  cols <- colnames(df)
  covars <- cols[grep("covar", colnames(df))]
  m.out <- matchit(group ~ covars, data=df, method="nearest", distance="logit", caliper=.20)

but got the error: variable lengths differ (found for 'covars').

Defining a new data frame with only covars and group would work, but that defeats my purpose in using matchit(), because I want the matched data to keep the other columns too, not just the covariates I chose to match on.

This seems like an easy task, but I couldn't figure it out after some googling. I'm not sure what an R formula expects for a subset of columns. Any help is appreciated.


Answer 1:


You might want to use as.formula.
Try doing this:

Replace group ~ covars

with as.formula(paste('group', '~', paste(covars, collapse="+")))
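
As a minimal sketch of this approach, the example below builds the formula on simulated data and fits lm() with it. The data frame df, its column names, and the extra age column are made up for illustration. Base R's reformulate() is shown as an alternative that should produce an equivalent formula without pasting strings by hand.

```r
# Simulated data; all column names here are illustrative assumptions.
set.seed(1)
df <- data.frame(
  group   = rbinom(50, 1, 0.5),
  covar.1 = rnorm(50),
  covar.2 = rnorm(50),
  age     = rnorm(50, mean = 40, sd = 10)  # extra column, not a covariate
)

# Select the covariate names by prefix.
covars <- grep("^covar", colnames(df), value = TRUE)

# Build the formula from strings ...
f1 <- as.formula(paste("group", "~", paste(covars, collapse = " + ")))
# ... or with base R's reformulate().
f2 <- reformulate(covars, response = "group")

# The full data frame is passed as `data`; only the variables named
# in the formula enter the model.
fit <- lm(f1, data = df)
```

The same formula object can be passed as the first argument to matchit() in place of group ~ covars, with data=df left unchanged.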




Answer 2:


I mentioned this in your other question, but the cobalt package has a function specifically for this: f.build(). The first argument to f.build() is a string containing the name of the treatment variable (or whatever goes on the left-hand side of the formula), and the second argument is a character vector containing the names of the variables for the right-hand side (i.e., the covariates). The second argument can also be a data.frame containing the covariates; f.build() simply extracts the names. It then performs the paste-and-as.formula() operation described in the chosen answer, but adds a few other features that make it a little more general and robust to errors.

The cobalt documentation has a section on f.build() that demonstrates its use with glm() and matchit().
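
A short sketch of the call described above, assuming the installed cobalt version exports f.build() with the treatment name first and the covariate names second. The covariate names are hypothetical, and the fallback to base R's reformulate() is an addition for illustration only, so the snippet still runs without cobalt.

```r
# Hypothetical covariate names, for illustration only.
covars <- c("covar.1", "covar.2")

# Treatment name first, covariate names second.
# Fall back to base R's reformulate() if cobalt or f.build() is unavailable.
f <- tryCatch(
  cobalt::f.build("group", covars),
  error = function(e) reformulate(covars, response = "group")
)

print(f)
```

The resulting formula can then be passed to glm() or matchit() along with the full data frame.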

After running matchit(), you can assess balance on the covariates using the bal.tab() function in cobalt, which is compatible with MatchIt:

bal.tab(m.out, un = TRUE)

The documentation for cobalt explains its use with MatchIt in detail.



Source: https://stackoverflow.com/questions/53854697/fit-model-on-a-subset-of-columns-in-dataframe-in-r
