data.table: lapply a function with multicolumn output

风格不统一 submitted on 2019-12-05 11:03:08
Frank

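The j argument in DT[i, j, by] expects a list when the result should spread across several columns. smean.cl.normal returns a named numeric vector, so wrap it in as.list: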

dt[, 
  Reduce(c, lapply(.SD, function(x) as.list(smean.cl.normal(x))))
, by = gr, .SDcols = "x"]

#    gr       Mean      Lower     Upper
# 1:  A  0.1032966 -0.1899466 0.3965398
# 2:  B -0.1437617 -0.4261330 0.1386096

c(L1, L2, L3) is how lists are combined, so Reduce(c, List_o_Lists) does the trick in case your .SDcols contains more than just x. I guess do.call(c, List_o_Lists) should also work.
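
To make the difference between the two combiners concrete, here is a minimal sketch with two value columns. It assumes a toy table with columns gr, x and y, and uses ci_normal, a hypothetical stand-in for smean.cl.normal (t-based interval, no NA handling), so that only data.table is needed. The names ci_normal, res_reduce and res_docall are illustrative and not part of the original answer.

```r
library(data.table)

# Hypothetical stand-in for Hmisc::smean.cl.normal (assumption: no NAs in x)
ci_normal <- function(x, conf.int = 0.95) {
  n <- length(x)
  half <- qt((1 + conf.int) / 2, n - 1) * sd(x) / sqrt(n)
  c(Mean = mean(x), Lower = mean(x) - half, Upper = mean(x) + half)
}

set.seed(1)
dt <- data.table(gr = rep(c("A", "B"), each = 50),
                 x = rnorm(100), y = rnorm(100))

# Reduce(c, ...) concatenates the per-column lists and keeps only the inner names
res_reduce <- dt[, Reduce(c, lapply(.SD, function(v) as.list(ci_normal(v)))),
                 by = gr, .SDcols = c("x", "y")]

# do.call(c, ...) also concatenates, but prefixes each inner name
# with the name of the column it came from (x.Mean, y.Mean, ...)
res_docall <- dt[, do.call(c, lapply(.SD, function(v) as.list(ci_normal(v)))),
                 by = gr, .SDcols = c("x", "y")]

names(res_docall)
```

With more than one column in .SDcols, the do.call(c, ...) form is probably the easier one to read afterwards, because each statistic stays labelled with its source column.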


Comments

This is quite inefficient for a couple of reasons. Turn on verbose=TRUE to see that data.table doesn't like getting named lists in j:

The result of j is a named list. It's very inefficient to create the same names over and over again for each group. When j=list(...), any names are detected, removed and put back after grouping has completed, for efficiency. Using j=transform(), for example, prevents that speedup (consider changing to :=). This message may be upgraded to warning in future.

Also, you are missing out on group-optimized versions of mean and other functions that can probably be used to build your result. This may not be a big deal for your use-case, though.
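
As a rough illustration of that last point, the same interval can be built from plain per-group summaries. This is only a sketch under assumptions: a toy table with columns gr and x, no missing values, and a t-based 95% interval, which is what smean.cl.normal computes by default. Whether data.table actually applies its grouped optimization to the first step can be checked with verbose=TRUE.

```r
library(data.table)

set.seed(1)
dt <- data.table(gr = rep(c("A", "B"), each = 50), x = rnorm(100))

# Step 1: simple per-group summaries; j is a literal list of plain calls,
# the form data.table is able to optimize
res <- dt[, .(Mean = mean(x), SD = sd(x), N = .N), by = gr]

# Step 2: compute the interval once, vectorized over the groups
res[, half := qt(0.975, N - 1) * SD / sqrt(N)]
res[, `:=`(Lower = Mean - half, Upper = Mean + half)]

# Drop the helper columns so the layout matches gr, Mean, Lower, Upper
res[, c("SD", "N", "half") := NULL]
res
```

Because j is written as .(…) here, the column names are set once after grouping rather than rebuilt for every group, which also avoids the named-list message quoted above.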


When you're applying this to only a single value column, just:

dt[, as.list(smean.cl.normal(x)), by = gr]

suffices.
