I was trying to automate a piece of my code so that programming becomes less tedious.
Basically, I am trying to do a stepwise selection of variables using fastbw() in the rms package. I would like to pass the list of variables selected by fastbw() into a formula such as y ~ x1+x2+x3, where "x1" "x2" "x3" is the list of variables selected by fastbw().
Here is the code I tried, which did not work:
olsOAW0.r060 <- ols(roll_pct~byoy+trans_YoY+change18m,
subset= helper=="POPNOAW0_r060",
na.action = na.exclude,
data = modelready)
OAW0 <- fastbw(olsOAW0.r060, rule="p", type="residual", sls= 0.05)
vec <- as.vector(OAW0$names.kept, mode="any")
b <- paste(vec, sep ="+") ##I even tried b <- paste(OAW0$names.kept, sep="+")
bestp.OAW0.r060 <- lm(roll_pct ~ b ,
data = modelready,
subset = helper =="POPNOAW0_r060",
na.action = na.exclude)
I am new to R and still haven't scaled the steep learning curve, so I apologize for any obvious programming blunders.
You're almost there. You just have to paste the entire formula together, something like this:
paste("roll_pct ~ ",b,sep = "")
Then coerce it to an actual formula using as.formula and pass that to lm. Technically, I think lm may coerce a character string itself, but coercing it yourself is generally safer. (Some functions that expect formulas won't do the coercion for you, others will.)
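You would actually need to use collapse instead of sep when defining b: sep only separates distinct arguments to paste, while collapse joins the elements of a single vector into one string.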
b <- paste(OAW0$names.kept, collapse="+")
Then you can plug it into joran's answer:
paste("roll_pct ~ ",b,sep = "")
or just use:
paste("roll_pct ~ ",paste(OAW0$names.kept, collapse="+"),sep = "")
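To see why sep did not work in the original attempt, here is a small sketch with a hypothetical vector of variable names. With a single vector argument, sep has nothing to separate, so the vector comes back unchanged; in the question, b was therefore still a multi-element character vector, and roll_pct ~ b treated b itself as the predictor.

```r
vars <- c("x1", "x2", "x3")  # hypothetical variable names

# sep only goes between *different arguments* of paste;
# a single vector argument is returned element by element
with_sep <- paste(vars, sep = "+")

# collapse joins the elements of the vector into one string
with_collapse <- paste(vars, collapse = "+")

print(with_sep)
print(with_collapse)
```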
I ran into a similar issue today. If you want to make it even more generic, so that you don't even need a fixed name for the class (dependent) variable, you can use:
frmla <- as.formula(paste(colnames(modelready)[1], paste(colnames(modelready)[2:ncol(modelready)], sep = "",
collapse = " + "), sep = " ~ "))
This assumes that the class variable (the dependent variable) is in the first column, but the indexing can easily be switched to the last column:
frmla <- as.formula(paste(colnames(modelready)[ncol(modelready)], paste(colnames(modelready)[1:(ncol(modelready)-1)], sep = "",
collapse = " + "), sep = " ~ "))
Then continue with lm using:
bestp.OAW0.r060 <- lm(frmla , data = modelready, ... )
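As an alternative to pasting strings, base R's stats package provides reformulate(), which builds a formula from a character vector of term labels and a response. This is a different technique from the paste/as.formula approach in this answer, shown only as a sketch; mtcars is used as a stand-in for modelready, with the response assumed to be in the first column.

```r
# Assumption: the response is the first column, as in the answer above
response   <- colnames(mtcars)[1]
predictors <- colnames(mtcars)[-1]

# reformulate() assembles response ~ term1 + term2 + ... directly
frmla <- reformulate(predictors, response = response)

fit <- lm(frmla, data = mtcars)
print(frmla)
```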
If you're looking for something less verbose:
fm <- as.formula( paste( colnames(df)[i], ".", sep=" ~ "))
# i is the index of the outcome column
Here it is in a function:
getFormula <- function(target, df) {
  # Exact match on the column name; grep() would also match partial names
  i <- match(target, colnames(df))
  stopifnot(!is.na(i))
  as.formula(paste(colnames(df)[i], ".", sep = " ~ "))
}
library(rpart) # provides rpart()
fm <- getFormula("myOutcomeColumnName", myDataFrame)
rp <- rpart(fm, data = myDataFrame) # Use the formula to build a model
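The dot on the right-hand side means "every other column in data", and it is expanded when the model is fitted. A minimal sketch of that behaviour, assuming lm and the built-in mtcars data rather than rpart and myDataFrame:

```r
# Build "mpg ~ ." from a column name and coerce it to a formula
fm <- as.formula(paste("mpg", ".", sep = " ~ "))

# The dot is resolved against the data when the model is fitted
fit <- lm(fm, data = mtcars)

# The fitted model's terms list every column except the response
expanded <- attr(terms(fit), "term.labels")
print(expanded)
```

Because the dot picks up every remaining column, any columns that should not be predictors (IDs, helper flags and the like) need to be dropped from the data frame first.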
Just to simplify and consolidate the above answers into a function:
my_formula<- function(colPosition, trainSet){
dep_part<- paste(colnames(trainSet)[colPosition],"~",sep=" ")
ind_part<- paste(colnames(trainSet)[-colPosition],collapse=" + ")
dt_formula<- as.formula(paste(dep_part,ind_part,sep=" "))
return(dt_formula)
}
To use it:
my_formula( dependent_var_position, myTrainSet)
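A concrete call might look like the sketch below. It repeats the function so that the block runs on its own, and it assumes the built-in mtcars data with the dependent variable (mpg) in column 1; neither is part of the original answer.

```r
my_formula <- function(colPosition, trainSet) {
  dep_part <- paste(colnames(trainSet)[colPosition], "~", sep = " ")
  # Negative indexing drops the dependent variable from the predictors
  ind_part <- paste(colnames(trainSet)[-colPosition], collapse = " + ")
  as.formula(paste(dep_part, ind_part, sep = " "))
}

# mpg is column 1 of mtcars, so it becomes the left-hand side
f <- my_formula(1, mtcars)
fit <- lm(f, data = mtcars)
print(f)
```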
Source: https://stackoverflow.com/questions/9238038/pass-a-vector-of-variables-into-lm-formula