Question
I'm trying to use lm() and matchit() on a subset of covariates. I have generated an arbitrary number of columns with prefix "covar", i.e. "covar.1", "covar.2", etc. I'd like to do something like
lm(group ~ covars, data=df)
where covars is a vector of strings c("covar.1", "covar.2", ...).
I tried several things like
cols <- colnames(df)
covars <- cols[grep("covar", colnames(df))]
m.out <- matchit(group ~ covars, data=df, method="nearest", distance="logit", caliper=.20)
but got variable lengths differ (found for 'covars').
Defining a new data frame with only covars and group would work, but that defeats my purpose in using matchit, because I want the matched data to keep the other columns too, not just the covariates I picked to match on.
This seems like an easy task, but I couldn't figure it out after some googling. I'm not sure what an R formula expects for a subset of columns. Any help is appreciated.
Answer 1:
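You might want to use as.formula. A formula treats covars as a single variable name rather than expanding the strings it holds, so the terms have to be pasted together and converted into a formula first.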
Try this: replace group ~ covars with
as.formula(paste('group', '~', paste(covars, collapse="+")))
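To make the suggestion concrete, here is a minimal sketch on simulated data. The data frame, its column names, and the extra "other" column are assumptions made up for illustration; reformulate() is a base R alternative that is not mentioned in the answer but builds the same formula.

```r
# Simulated data; all column names here are illustrative assumptions
set.seed(1)
df <- data.frame(
  group   = rbinom(100, 1, 0.5),
  covar.1 = rnorm(100),
  covar.2 = rnorm(100),
  other   = runif(100)   # a column that should stay out of the model
)

# Pick the covariate names by prefix; value = TRUE returns names, not indices
covars <- grep("^covar\\.", colnames(df), value = TRUE)

# Option 1: paste the terms into a string and convert it to a formula
f1 <- as.formula(paste("group ~", paste(covars, collapse = " + ")))

# Option 2: reformulate() from the stats package builds the same formula
f2 <- reformulate(covars, response = "group")

# Only the covariates enter the model; df itself keeps every column
fit <- lm(f1, data = df)
print(f1)
```

Either formula object can be passed wherever group ~ covars was written, including the matchit() call from the question.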
Answer 2:
I mentioned this in your other question, but the cobalt package has a function specifically for this: f.build(). The first argument to f.build() is a string containing the name of the treatment variable (or the left-hand side of the formula), and the second argument is a character vector containing the names of the variables for the right-hand side of the formula (i.e., the covariates). The second argument can also be a data.frame containing the covariates; f.build() simply extracts the names. It then performs the operation described in the chosen answer, but adds a few other features that make it a little more general and robust to errors.
The cobalt documentation has a section on f.build() and demonstrates its use with glm() and matchit() as examples.
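A rough sketch of how this might look with matchit(), assuming the cobalt and MatchIt packages are installed. The simulated data and the "outcome" column are invented for illustration, and the distance argument is left at its default (a logistic-regression propensity score) rather than set explicitly.

```r
library(cobalt)   # provides f.build()
library(MatchIt)  # provides matchit() and match.data()

# Simulated data; column names are illustrative assumptions
set.seed(1)
n <- 200
df <- data.frame(
  covar.1 = rnorm(n),
  covar.2 = rnorm(n),
  outcome = rnorm(n)   # a non-covariate column we want to keep after matching
)
# Treatment assignment depends on the covariates
df$group <- rbinom(n, 1, plogis(0.5 * df$covar.1 - 0.3 * df$covar.2))

# Covariate names selected by prefix
covars <- grep("^covar\\.", colnames(df), value = TRUE)

# Treatment name on the left, covariate names on the right
f <- f.build("group", covars)

# Match on the selected covariates only, using the full data frame
m.out <- matchit(f, data = df, method = "nearest", caliper = 0.2)

# The matched data set retains every original column, not just the covariates
matched <- match.data(m.out)
print(names(matched))
```

Because the full data frame is passed as data, match.data() returns the matched rows with the other columns intact, which addresses the concern raised in the question.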
After running matchit(), you can assess balance on the covariates using the bal.tab() function in cobalt, which is compatible with MatchIt:
bal.tab(m.out, un = TRUE)
The documentation for cobalt explains its use with MatchIt in detail.
Source: https://stackoverflow.com/questions/53854697/fit-model-on-a-subset-of-columns-in-dataframe-in-r