How to reliably get dependent variable name from formula object?

清酒与你 2021-01-31 02:29

Let's say I have the following formula:

myformula <- formula("depVar ~ Var1 + Var2")

How to reliably get dependent variable name from formula object?

7 answers
  •  面向向阳花
    2021-01-31 03:18

    Based on your edit to get the actual response, not just its name, we can use the nonstandard evaluation idiom employed by lm() and most other modelling functions with a formula interface in base R:

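    form <- formula("depVar ~ Var1 + Var2")
    dat <- data.frame(depVar = rnorm(10), Var1 = rnorm(10), Var2 = rnorm(10))
    
    # The arguments must be named "formula" and "data" so that match() below
    # finds them in the captured call.
    getResponse <- function(formula, data) {
        mf <- match.call(expand.dots = FALSE)            # capture the call
        m <- match(c("formula", "data"), names(mf), 0L)  # locate the arguments
        mf <- mf[c(1L, m)]                               # function name + matches
        mf$drop.unused.levels <- TRUE
        mf[[1L]] <- as.name("model.frame")               # now a model.frame() call
        mf <- eval(mf, parent.frame())                   # evaluate in the caller
        model.response(mf, "numeric")                    # extract the response
    }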
    
    > getResponse(form, dat)
              1           2           3           4           5 
    -0.02828573 -0.41157817  2.45489291  1.39035938 -0.31267835 
              6           7           8           9          10 
    -0.39945771 -0.09141438  0.81826105  0.37448482 -0.55732976
    

    As you can see, this gets the actual response variable data from the supplied data frame.

    The function first captures its own call with match.call(). Setting expand.dots = FALSE leaves any ... arguments unexpanded, since they contain nothing needed to evaluate the data for the formula.

    Next, the "formula" and "data" arguments are matched against the names in the call. The line mf[c(1L, m)] keeps the function name from the call (position 1L) and the positions of the two matched arguments. The next line sets the drop.unused.levels argument of model.frame() to TRUE, and then the function name in the call is switched to model.frame (from getResponse here, or from lm inside lm()). In short, all the code above does is take the original call and turn it into a call to model.frame().
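
    To make the rewriting steps above easier to follow, here is a minimal sketch that performs the same manipulation but returns the rewritten call unevaluated. The helper name showCall and the extra weights argument are hypothetical and exist only for illustration. The sketch assumes the arguments are named formula and data, because match() looks them up by those names.

```r
# Hypothetical helper (not part of base R): builds the model.frame() call
# using the same steps as above, but returns it without evaluating it.
showCall <- function(formula, data, ...) {
    mf <- match.call(expand.dots = FALSE)            # call as written; ... kept as one element
    m <- match(c("formula", "data"), names(mf), 0L)  # positions of the two arguments (0 if absent)
    mf <- mf[c(1L, m)]                               # keep function name + matched arguments
    mf$drop.unused.levels <- TRUE                    # extra argument for model.frame()
    mf[[1L]] <- as.name("model.frame")               # swap the function being called
    mf
}

form <- formula("depVar ~ Var1 + Var2")
dat <- data.frame(depVar = rnorm(10), Var1 = rnorm(10), Var2 = rnorm(10))

# weights is an illustrative extra argument; it lands in ... and is dropped.
cl <- showCall(form, dat, weights = NULL)
print(cl)          # the rewritten, still unevaluated, model.frame() call
print(names(cl))   # argument names that survived the subsetting
```

    Note that the call still refers to the symbols form and dat rather than to their values, which is why it matters where the call is later evaluated.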

    This modified call is then evaluated in the calling environment of the function (parent.frame()), which in this case is the global environment.
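
    The reason for evaluating in the caller's frame is that the captured call holds the argument symbols, not their values. The following sketch, under the assumption of two hypothetical helpers getFrame and wrapper, shows the call being resolved against variables that exist only inside the calling function.

```r
# Hypothetical helper: turn its own call into a model.frame() call and
# evaluate it where the caller's variables live.
getFrame <- function(formula, data) {
    cl <- match.call()                   # getFrame(formula = localForm, data = localDat)
    cl[[1L]] <- as.name("model.frame")   # model.frame(formula = localForm, data = localDat)
    eval(cl, parent.frame())             # symbols are looked up in the calling frame
}

# Hypothetical caller: localForm and localDat are not visible globally.
wrapper <- function() {
    localForm <- y ~ x
    localDat <- data.frame(y = c(1, 2, 3), x = c(4, 5, 6))
    getFrame(localForm, localDat)
}

mf <- wrapper()
print(mf)
```

    Had the call been evaluated in the global environment instead, the lookup of localForm would fail, because that symbol is defined only inside wrapper().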

    The last line uses the model.response() extractor function to take the response variable from the model frame.
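
    As a small illustration of that last step, and of the simpler case where only the name is wanted, here is a sketch using fixed data. It assumes a two-sided formula whose left-hand side is a single plain variable; for a left-hand side such as log(depVar) the two name-based approaches would give different results.

```r
form <- formula("depVar ~ Var1 + Var2")
dat <- data.frame(depVar = c(1.5, 2.5, 3.5), Var1 = c(1, 2, 3), Var2 = c(3, 1, 2))

# Build the model frame directly, then extract the response column from it.
mf <- model.frame(form, data = dat)
y <- model.response(mf, "numeric")   # numeric vector named by row

# If only the name is needed (assumes a simple, single-variable left-hand side):
respName <- all.vars(form)[1]        # first variable appearing in the formula
lhsText <- deparse(form[[2]])        # left-hand side of the formula as text

print(y)
print(respName)
print(lhsText)
```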
