R custom data.table function with multiple variable inputs

前端 未结 2 516
后悔当初
后悔当初 2021-01-14 04:44

I am writing a custom aggregation function with data.table (v 1.9.6) and struggle to pass function arguments to it. there have been similar questions on this but none deals

2条回答
  •  北海茫月
    2021-01-14 05:10

    Here's an option using mget, as commented:

    fn_agg <- function(DT, var_list, var_name_list, by_var_list, order_var_list) {
    
      temp <- DT[, setNames(lapply(.SD, sum, na.rm = TRUE), var_name_list), 
                 by = by_var_list, .SDcols = var_list]
    
      setorderv(temp, order_var_list)
    
      cols1 <- paste0(var_name_list, "_del")
      cols2 <- paste0(cols1, "_rel")
    
      temp[, (cols1) := lapply(mget(var_name_list), function(x) {
        x - shift(x, n = 1, type = "lag")
      })]
    
      temp[, (cols2) := lapply(mget(var_name_list), function(x) {
        xshift <- shift(x, n = 1, type = "lag")
        (x - xshift) / xshift
      })]
    
      temp[]
    }
    
    fn_agg(dt, 
           var_list = c("x", "y"), 
           var_name_list = c("x_sum", "y_sum"), 
           by_var_list = c("a", "b"), 
           order_var_list = c("a", "b"))
    
    #   a b x_sum y_sum x_sum_del y_sum_del x_sum_del_rel y_sum_del_rel
    #1: a e   254   358        NA        NA            NA            NA
    #2: b f   246   116        -8      -242  -0.031496063    -0.6759777
    #3: c g   272   242        26       126   0.105691057     1.0862069
    #4: d h   273   194         1       -48   0.003676471    -0.1983471
    

    Instead of mget, you could also make use of data.table's .SDcols argument as in

    temp[, (cols1) := lapply(.SD, function(x) {
        x - shift(x, n = 1, type = "lag")
      }), .SDcols = var_name_list]
    

    Also, there are probably ways to improve the function by avoiding duplicated computation of shift(x, n = 1, type = "lag") but I only wanted to demonstrate a way to use data.table in functions.

提交回复
热议问题