Being aware of the danger of using dynamic variable names, I am trying to loop over varios regression models where different variables specifications are choosen. Usually
1) Just use lm(df2)
or if lm
has additional columns beyond what is shown in the question but we just want to regress on x1
and x2
then
df3 <- df2[c("y", var, "x2")]
lm(df3)
The following are optional and only apply if it is important that the formula appear in the output as if it had been explicitly given.
Compute the formula fo
using the first line below and then run lm
as in the second line:
fo <- formula(model.frame(df3))
fm <- do.call("lm", list(fo, quote(df3)))
or just run lm
as in the first line below and then write the formula into it as in the second line:
fm <- lm(df3)
fm$call <- formula(model.frame(df3))
Either one gives this:
> fm
Call:
lm(formula = y ~ x1 + x2, data = df3)
Coefficients:
(Intercept) x1 x2
0.44752 0.04278 0.05011
2) character string lm
accepts a character string for the formula so this also works. The fn$
causes substitution to occur in the character arguments.
library(gsubfn)
fn$lm("y ~ $var + x2", quote(df2))
or at the expense of more involved code, without gsubfn:
do.call("lm", list(sprintf("y ~ %s + x2", var), quote(df2)))
or if you don't care that the formula displays without var
substituted then just:
lm(sprintf("y ~ %s + x2", var), df2)