Passing column names to data.table programmatically

暖寄归人 2020-12-30 07:17

I would like to be able to write a function that runs regressions in a data.table by groups and then nicely organizes the results. Here is a sample of what I w

3 Answers
  • 2020-12-30 07:56

    Can't you just add (inside that anonymous function call):

     f <- as.formula(f) 
    

    ... as a separate line before the dtb[,as.list(coef(lm(f, ...)? That's the usual way of turning a character element into a formula object.

    > res = lapply(models, function(f) {f <- as.formula(f)
                     dtb[,as.list(coef(lm(f, weights=weights, data=.SD))),by=thedate]})
    > 
    > str(res)
    List of 2
     $ :Classes ‘data.table’ and 'data.frame':  2 obs. of  3 variables:
      ..$ thedate    : int [1:2] 1 2
      ..$ (Intercept): num [1:2] 11 11
      ..$ x          : num [1:2] -1 -1
      ..- attr(*, ".internal.selfref")=<externalptr> 
     $ :Classes ‘data.table’ and 'data.frame':  2 obs. of  3 variables:
      ..$ thedate    : int [1:2] 1 2
      ..$ (Intercept): num [1:2] 6.27 11.7
      ..$ z          : num [1:2] 0.0633 -0.7995
      ..- attr(*, ".internal.selfref")=<externalptr> 
    

    If you need to build character versions of formulas from component names, use paste or paste0 and put the results in the models character vector. I can supply tested code if you provide a testable example.
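
    As a rough illustration of that last point, here is a minimal base-R sketch. The names response and predictors are hypothetical placeholders, not taken from the question; it only shows how formula strings can be pasted together and then converted with as.formula.

```r
# Hypothetical component names (assumptions for illustration only)
response   <- "y"
predictors <- c("x", "z")

# One formula string per predictor: "y ~ x" and "y ~ z"
models <- paste0(response, " ~ ", predictors)

# Convert each string to a formula object before handing it to lm()
formulas <- lapply(models, as.formula)

# reformulate() is a base-R alternative that skips the string step;
# here it builds a single formula with both predictors
f_all <- reformulate(predictors, response = response)
```

    The models vector built this way could then be passed to the lapply call above, with as.formula applied inside the anonymous function as shown.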

  • 2020-12-30 07:59

    Here is a solution that relies on having the data in long format (which makes more sense to me, in this cas

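    library(data.table)
    library(reshape2)
    # reshape to long format: the x and z columns are stacked into `variable`/`value`
    dtlong <- data.table(melt(dtb, measure.vars = c('x','z')))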
    
    
    foo <- function(f, d, by, w ){
      # get the name of the w argument (weights)
      w.char <- deparse(substitute(w))
      # convert `list(a,b)` to `c('a','b')`
      # obviously, this would have to change depending on how `by` was defined
      by <- unlist(lapply(as.list(as.list(match.call())[['by']])[-1], as.character))
      # create the call, substituting the weights column name for `w`
      .c <- substitute(as.list(coef(lm(f, data = .SD, weights = w))), list(w = as.name(w.char)))
      # actually perform the calculations
      d[,eval(.c), by = by]
    }
    
    foo(f= y~value, d= dtlong, by = list(variable, thedate), w = weights)
    
       variable thedate (Intercept)       value
    1:        x       1   11.000000 -1.00000000
    2:        x       2   11.000000 -1.00000000
    3:        z       1    1.009595  0.89019190
    4:        z       2    7.538462 -0.03846154
    
  • 2020-12-30 08:03

    One possible solution:

    fun = function(dtb, models, w_col_name, date_name) {
      # w_col_name and date_name are column names given as strings
      lapply(models, function(f) {
        dtb[, as.list(coef(lm(f, weights = eval(parse(text = w_col_name)), data = .SD))),
            by = eval(parse(text = paste0("list(", date_name, ")")))]
      })
    }
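
    A sketch of an alternative that avoids eval(parse(text = ...)): build the j expression with bquote() and pass the grouping columns to by as a character vector. The data below is made up for illustration (the question's dtb is not shown in full), and fit_by_group, by_cols and j_call are hypothetical names.

```r
library(data.table)

# Made-up data for illustration only; not the data from the question
set.seed(1)
dtb <- data.table(thedate = rep(1:2, each = 5),
                  y = rnorm(10), x = rnorm(10), z = rnorm(10),
                  weights = runif(10))

fit_by_group <- function(dt, models, w_col_name, by_cols) {
  lapply(models, function(f) {
    # Splice the formula and the weights column (as a symbol) into the call,
    # so lm() sees a column name rather than a string
    j_call <- bquote(
      as.list(coef(lm(.(f), weights = .(as.name(w_col_name)), data = .SD)))
    )
    # `by` accepts a character vector of column names directly
    dt[, eval(j_call), by = by_cols]
  })
}

res <- fit_by_group(dtb, list(y ~ x, y ~ z), "weights", "thedate")
```

    Each element of res should be a data.table with one row per thedate and one column per coefficient, the same shape as the output in the first answer.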
    