Passing column names to data.table programmatically

后端未结


I would like to be able to write a function that runs regressions in a data.table by groups and then nicely organizes the results. Here is a sample of what I w


3 answers

终归单人心

2020-12-30 07:56

Can't you just add (inside that anonymous function call):

 f <- as.formula(f)

... as a separate line before the dtb[,as.list(coef(lm(f, ...) call? That's the usual way of turning a character string into a formula object.

> res = lapply(models, function(f) {f <- as.formula(f)
                 dtb[,as.list(coef(lm(f, weights=weights, data=.SD))),by=thedate]})
> 
> str(res)
List of 2
 $ :Classes ‘data.table’ and 'data.frame':  2 obs. of  3 variables:
  ..$ thedate    : int [1:2] 1 2
  ..$ (Intercept): num [1:2] 11 11
  ..$ x          : num [1:2] -1 -1
  ..- attr(*, ".internal.selfref")=<externalptr> 
 $ :Classes ‘data.table’ and 'data.frame':  2 obs. of  3 variables:
  ..$ thedate    : int [1:2] 1 2
  ..$ (Intercept): num [1:2] 6.27 11.7
  ..$ z          : num [1:2] 0.0633 -0.7995
  ..- attr(*, ".internal.selfref")=<externalptr>

If you need to build character versions of formulas from component names, use paste or paste0 and pass the result as the models character vector. I can supply tested code once a testable example is provided.
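As a rough sketch of that last point, formula strings can be assembled from column names with paste0 and then converted with as.formula. The names response and predictors are placeholders chosen for illustration; the columns y, x and z are assumed from the output above. reformulate() from the stats package is shown as an alternative that skips the string step.

```r
# column names held as plain strings (assumed for illustration)
response   <- "y"
predictors <- c("x", "z")

# one formula string per predictor: "y ~ x", "y ~ z"
models <- paste0(response, " ~ ", predictors)

# convert each string to a formula object before handing it to lm()
formulas <- lapply(models, as.formula)

# alternative: build the formula directly from the names
f_alt <- reformulate("x", response = "y")
```

The models vector can then be passed to the lapply call shown above.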


遇见更好的自我

2020-12-30 07:59

Here is a solution that relies on having the data in long format (which makes more sense to me, in this cas

library(data.table)
library(reshape2)
dtlong <- data.table(melt(dtb, measure.vars = c('x','z')))


foo <- function(f, d, by, w ){
  # get the name of the w argument (weights)
  w.char <- deparse(substitute(w))
  # convert `list(a,b)` to `c('a','b')`
  # obviously, this would have to change depending on how `by` was defined
  by <- unlist(lapply(as.list(as.list(match.call())[['by']])[-1], as.character))
  # create the call substituting the names as required
  .c <- substitute(as.list(coef(lm(f, data = .SD, weights = w))), list(w = as.name(w.char)))
  # actually perform the calculations
  d[,eval(.c), by = by]
}

foo(f= y~value, d= dtlong, by = list(variable, thedate), w = weights)

   variable thedate (Intercept)       value
1:        x       1   11.000000 -1.00000000
2:        x       2   11.000000 -1.00000000
3:        z       1    1.009595  0.89019190
4:        z       2    7.538462 -0.03846154
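The key step in foo is building the call with the weights column name spliced in. Here is a minimal base-R sketch of just that step, assuming a column called weights and the formula y ~ value from the call above. Nothing is evaluated, so no data is needed.

```r
# name of the weights column as a string (assumed for illustration)
w.char <- "weights"

# substitute() with an explicit list replaces only the symbol `w`
tmpl <- substitute(lm(y ~ value, data = .SD, weights = w),
                   list(w = as.name(w.char)))

# turn the unevaluated call back into text to inspect it
call_text <- paste(deparse(tmpl), collapse = " ")
print(call_text)
```

Inside data.table, an unevaluated call like tmpl is what gets passed to eval() in j, as foo does with .c.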


广开言路

2020-12-30 08:03

One possible solution:

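fun = function(dtb, models, w_col_name, date_name) {
  # fit each model by group; the weights and grouping columns are given as strings
  res = lapply(models, function(f) {
    dtb[, as.list(coef(lm(f, weights = eval(parse(text = w_col_name)), data = .SD))),
        by = eval(parse(text = paste0("list(", date_name, ")")))]
  })
  res  # return the list of per-model results explicitly
}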
