Writing Functions With a “data” Argument

陌清茗 2021-01-23 23:46

I want to write a function that can take columns within a data frame or column names and the data frame they come from as arguments.

df <- data.frame(x = c(1:         


        
2 Answers
  • 2021-01-24 00:47

    To sort of piggy-back off of Cettt - something like this may be what you're looking for:

    df <- data.frame(x = c(1:5), y = c(6:10), z = LETTERS[1:5])
    
    my_fxn <- function (aaa, bbb, ccc, data) {
      if (!missing(data)) {
        aaa = as.numeric(data[[aaa]])
        bbb = as.numeric(data[[bbb]])
        ccc = as.character(data[[ccc]])
      }
      print(aaa[1])
    }
    
    my_fxn("x", "y", "z", df)
    #> [1] 1
    

    With enquo() (from rlang, re-exported by dplyr), the column names no longer need to be passed as strings:

    library(dplyr)
    
    my_fxn <- function (aaa, bbb, ccc, data) {
      if (!missing(data)) {
        # capture the bare column names only when data is supplied,
        # so vectors passed directly are still evaluated normally
        aaa = as.numeric(pull(data, !!enquo(aaa)))
        bbb = as.numeric(pull(data, !!enquo(bbb)))
        ccc = as.character(pull(data, !!enquo(ccc)))
      }
      print(aaa[1])
    }
    
    my_fxn(x, y, z, df)
    #> [1] 1
    

    More info about function building with enquo() and !! can be found here: https://dplyr.tidyverse.org/articles/programming.html#programming-recipes
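
    The same pattern can also be written with the embrace operator {{ }}, which the programming vignette linked above describes as shorthand for enquo() followed by !!. This is only a minimal sketch: first_value is a hypothetical helper name, and it assumes a dplyr/rlang version recent enough to support {{ }} (rlang 0.4.0 or later).

    ```r
    library(dplyr)

    # Hypothetical helper: first value of a column given as a bare name
    first_value <- function(data, col) {
      # {{ col }} forwards the caller's bare column name to pull()
      pull(data, {{ col }})[1]
    }

    df <- data.frame(x = c(1:5), y = c(6:10), z = LETTERS[1:5])
    first_value(df, x)
    #> [1] 1
    ```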


    Finally, a base R solution using deparse() and substitute():

    my_fxn <- function (aaa, bbb, ccc, data) {
      if (!missing(data)) {
        # turn the bare names into strings only when data is supplied,
        # so vectors passed directly are still evaluated normally
        aaa = as.numeric(data[[deparse(substitute(aaa))]])
        bbb = as.numeric(data[[deparse(substitute(bbb))]])
        ccc = as.character(data[[deparse(substitute(ccc))]])
      }
      print(aaa[1])
    }
    
    my_fxn(x, y, z, df)
    #> [1] 1
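
    A variation on the base R idea, sketched under assumptions rather than taken from the question: eval() can look each argument up in data first and fall back to the calling environment, so the same function accepts either bare column names plus data or vectors passed directly. my_fxn2 is a hypothetical name, and both inputs are assumed to be numeric.

    ```r
    # my_fxn2 is a hypothetical name; numeric inputs are assumed
    my_fxn2 <- function(aaa, bbb, data = NULL) {
      caller <- parent.frame()  # environment my_fxn2 was called from
      # look each expression up in data first, then in the caller's environment
      aaa <- eval(substitute(aaa), data, caller)
      bbb <- eval(substitute(bbb), data, caller)
      as.numeric(aaa)[1] + as.numeric(bbb)[1]
    }

    df <- data.frame(x = 1:5, y = 6:10)
    my_fxn2(x, y, data = df)   # bare names resolved inside df
    #> [1] 7
    my_fxn2(df$x, df$y)        # vectors passed directly
    #> [1] 7
    ```

    When data is NULL, eval() treats it as an empty list, so the lookup goes straight to the caller's environment.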
    
  • 2021-01-24 00:49

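    The problem is that the $ operator does not evaluate what follows it: inside the function, data$aaa looks for a column literally named aaa, not for the column whose name was passed in (and when calling my_fxn(x, y, z, df), the object x is not even defined). Hence data$aaa does not return column x but NULL, and as.numeric(NULL)[1] is NA.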

    Consider this small example:

    df <- data.frame(x = 1:3, y = 4:6)
    x <- "y"
    df$x # returns column x
    #> [1] 1 2 3
    df[, x] # returns column y since the value which is stored in x is "y"
    #> [1] 4 5 6
    

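    To circumvent your problem you can use data[, aaa] or data[[aaa]] instead of data$aaa, with aaa holding the column name as a string. Yet another alternative would be the dplyr package, where select(data, all_of(aaa)) picks the column by name; note that select() returns a one-column data frame rather than a vector.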
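
    The difference is easy to check inside a function. The two helpers below are hypothetical names used only for illustration; both receive the column name as a string.

    ```r
    df <- data.frame(x = 1:3, y = 4:6)

    # Hypothetical helpers, named here only for illustration
    by_dollar  <- function(data, aaa) data$aaa     # looks for a column called "aaa"
    by_bracket <- function(data, aaa) data[[aaa]]  # uses the string stored in aaa

    by_dollar(df, "x")
    #> NULL
    by_bracket(df, "x")
    #> [1] 1 2 3
    ```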
