Function which runs lm over different variables

Submitted by 六月ゝ 毕业季﹏ on 2020-01-02 05:22:55

Question


I would like to create a function which can run a regression model (e.g. using lm) over different variables in a given dataset. In this function, I would specify as arguments the dataset I'm using, the dependent variable y and the independent variable x. I want this to be a function and not a loop as I would like to call the code in various places of my script. My naive function would look something like this:

lmfun <- function(data, y, x) {
  lm(y ~ x, data = data)
}

This function does not work because lm takes the formula y ~ x literally: it looks for columns named y and x in the dataset instead of the columns I pass in.

I have done some research and found the following helpful vignette: Programming with dplyr. The vignette gives this solution to a problem similar to the one I am facing:

library(dplyr)  # provides tibble(), enquo(), group_by(), summarise() and %>%

df <- tibble(
  g1 = c(1, 1, 2, 2, 2),
  g2 = c(1, 2, 1, 2, 1),
  a = sample(5),
  b = sample(5)
)

my_sum <- function(df, group_var) {
  group_var <- enquo(group_var)
  df %>%
    group_by(!! group_var) %>%
    summarise(a = mean(a))
}

I am aware that lm is not part of the dplyr package, but I would like to come up with a solution similar to this one. I've tried the following:

lmfun <- function(data, y, x) {
  y <- enquo(y)
  x <- enquo(x)

  lm(!! y ~ !! x, data = data)
}

lmfun(mtcars, mpg, disp)

Running this code gives the following error message:

Error in is_quosure(e2) : argument "e2" is missing, with no default

Does anyone have an idea how to amend the code to make this work?

Thanks,

Joost.


Answer 1:


You can fix this by converting the quosures to strings with quo_name and building the formula from them with formula:

library(rlang)  # provides enquo() and quo_name()

lmfun <- function(data, y, x) {
  y <- enquo(y)
  x <- enquo(x)

  model_formula <- formula(paste0(quo_name(y), "~", quo_name(x)))
  lm(model_formula, data = data)
}

lmfun(mtcars, mpg, disp)

# Call:
#   lm(formula = model_formula, data = data)
# 
# Coefficients:
#   (Intercept)         disp  
#      29.59985     -0.04122  
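
If the variable names are passed as strings, base R's reformulate() can build the same formula without pasting text by hand. This is a minimal sketch and not part of the original answer; the helper name lmfun_str is hypothetical.

```r
# Hypothetical helper: y and x are column names given as strings.
lmfun_str <- function(data, y, x) {
  # reformulate("disp", response = "mpg") returns the formula mpg ~ disp
  model_formula <- reformulate(x, response = y)
  lm(model_formula, data = data)
}

fit <- lmfun_str(mtcars, "mpg", "disp")
coef(fit)
```

Here x may also be a character vector of several column names, in which case reformulate() joins them with +.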



Answer 2:


Another solution:

lmf2 <- function(data, y, x) {
  fml <- substitute(y ~ x, list(y = substitute(y), x = substitute(x)))
  lm(eval(fml), data)
}

lmf2(mtcars, mpg, disp)
# Call:
# lm(formula = eval(fml), data = data)
# 
# Coefficients:
# (Intercept)         disp  
#    29.59985     -0.04122  

Or, equivalently:

lmf3 <- function(data, y, x) {
  lm(eval(call("~", substitute(y), substitute(x))), data)
}
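
Both versions record the call with an eval(...) expression in place of the actual formula. One possible way around that in base R, sketched here, is to splice the captured symbols into the whole lm call with bquote() before evaluating it. The name lmf4 is hypothetical and not from the original answer.

```r
lmf4 <- function(data, y, x) {
  y <- substitute(y)  # capture the unevaluated argument, e.g. the symbol mpg
  x <- substitute(x)
  # .() splices the symbols in, giving the call lm(mpg ~ disp, data = data)
  model_call <- bquote(lm(.(y) ~ .(x), data = data))
  eval(model_call)    # evaluated inside the function, so `data` is the argument
}

fit <- lmf4(mtcars, mpg, disp)
fit$call
```

The stored call then contains the real variable names, which keeps the printed model readable.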



Answer 3:


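If the arguments are passed unquoted, capture each one as a quosure (enquo), turn it into a string (quo_name), and convert that to a symbol (sym). The symbols can then be unquoted with !! inside a formula expression that is passed to lm, which keeps the syntax close to the OP's attempt: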

library(rlang)
lmfun <- function(data, y, x) {
  y <- sym(quo_name(enquo(y)))
  x <- sym(quo_name(enquo(x)))
  expr1 <- expr(!! y ~ !! x)

  model <- lm(expr1, data = data)
  model$call$formula <- expr1 # change the call formula
  model
}

lmfun(mtcars, mpg, disp)
#Call:
#lm(formula = mpg ~ disp, data = data)

#Coefficients:
#(Intercept)         disp  
#   29.59985     -0.04122  

If we pass strings instead, an option is to convert them to symbols with ensym and unquote those in the formula expression:

lmfun <- function(data, y, x) {
  y <- ensym(y)
  x <- ensym(x)
  expr1 <- expr(!! y ~ !! x)

  model <- lm(expr1, data = data)
  model$call$formula <- expr1 # change the call formula
  model
}

lmfun(mtcars, 'mpg', 'disp')
#Call:
#lm(formula = mpg ~ disp, data = data)


#Coefficients:
#(Intercept)         disp  
#   29.59985     -0.04122  

NOTE: Both options rely on tidyverse functions (from the rlang package).
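
A possible shortcut, sketched under the assumption that a recent rlang version providing inject() is installed: injecting the symbols directly into the lm call makes the recorded call show the real formula, so the model$call$formula patch is not needed. The name lmfun_inject is hypothetical.

```r
library(rlang)

lmfun_inject <- function(data, y, x) {
  y <- ensym(y)  # accepts a bare name or a string
  x <- ensym(x)
  # inject() resolves !! first, then evaluates lm(mpg ~ disp, data = data)
  inject(lm(!!y ~ !!x, data = data))
}

fit <- lmfun_inject(mtcars, mpg, disp)
fit$call
```

Because ensym() accepts both forms, lmfun_inject(mtcars, "mpg", "disp") should fit the same model.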




Answer 4:


Here is another option (edited: this is a refactored answer that takes the variable names as strings):

lmfun <- function(data, yname, xname) {
  formula1 <- as.formula(paste(yname, "~", xname))
  lm.fit <- do.call("lm", list(data = quote(data), formula1))
  lm.fit
}
lmfun(mtcars, "mpg", "disp")
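
Because this version takes column names as strings, it can be mapped over several predictors, which is the use case in the question title. A minimal sketch follows; the choice of predictors and the helper name lmfun_chr are illustrative assumptions.

```r
# Same approach as above, redefined here so the sketch is self-contained.
lmfun_chr <- function(data, yname, xname) {
  formula1 <- as.formula(paste(yname, "~", xname))
  do.call("lm", list(formula1, data = quote(data)))
}

predictors <- c("disp", "hp", "wt")  # illustrative choice of columns
models <- lapply(predictors, function(x) lmfun_chr(mtcars, "mpg", x))
names(models) <- predictors

# Slope of each single-predictor model
sapply(models, function(m) unname(coef(m)[2]))
```

Each element of models is an ordinary lm object, so summary() or predict() can be applied to it as usual.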

And the original answer, which passes the columns themselves rather than their names:

lmfun <- function(data, y, x) {
  formula1 <- as.formula(y ~ x)
  lm.fit <- do.call("lm", list(data = quote(data), formula1))
  lm.fit
}
lmfun(mtcars, mtcars$mpg, mtcars$disp)

Yields:

Call:
lm(formula = y ~ x, data = data)

Coefficients:
(Intercept)            x  
   29.59985     -0.04122  


Source: https://stackoverflow.com/questions/54060985/function-which-runs-lm-over-different-variables
