Using := multiple times in data.table

扶醉桌前 提交于 2019-12-12 20:05:18

问题


I frequently find myself doing a long series of chained calculations using := on the same data table. For example, something like this

test = data.table(1:10, 1:10, 1:10, 1:10)

test[, V1 := V1^2]
test[, V2 := V1*V2]
test[, V3 := V2/V3]
test[, V4 := sqrt(V3)]
test[, new := letters[V4]]

Having to write test[, ...] on every line 1) takes longer to type (not a big issue I can cope with that). But, more importantly, also distracts visually from the flow and content of the calculation. I would much rather write something like

test[, {
  V1 := V1^2
  V2 := V1*V2
  V3 := V2/V3
  V4 := sqrt(V3)
  new := letters[V4]
}]

But this throws an error of You have wrapped := with {} which is ok but then := must be the only thing inside {}.

I know that I can write

within(test, {
  V1 = V1^2
  V2 = V1*V2
  V3 = V2/V3
  V4 = sqrt(V3)
  new = letters[V4]
  })

But this loses the efficiency of using :=

I tried writing a function to provide this ability

with.dt = function(dt, expressions){
  e = strsplit(expressions,'\n')
  for (i in 1:length(e)){    
    dt[, eval(parse(text = e[i]))]
  }
  dt
  }

with.dt(test, "
  V1 := V1^2;
  V2 := V1*V2;
  V3 := V2/V3;
  V4 := sqrt(V3);
  new := letters[V4];
  ")

But this does not change the content of the data.table

Is there any syntax that permits the neat look of the within version, while retaining the assign by reference property of :=?


回答1:


There are a couple different ways to do this.

Original test matrix used here:

   v1 v2
1:  1  3
2:  2  4

First, we can do something like this:

test[,`:=`(v1 = v1^2,
          v2 = v1*v2)

Output:

v1 v2
 1  3
 4  8

Or, if we want it done sequentially we can use Frank's hack.

test[, `:=`(v1 = v1 <- v1^2, v2 = v2 * v1)]

Output:

v1 v2
 1  3
 4 16  

Or, similarly we could run this:

test[,c("v1","v2") := list(v1^2,v1^2*v2)]

Output:

v1 v2
 1  3
 4 16



回答2:


We can use a function that takes a list of expressions:

with.dt = function(dt, expr){
  for (j in 1:length(expr)) set(dt, , names(expr)[j], dt[, eval(expr[[j]])])
}

test = data.table(1:10, 1:10, 1:10, 1:10)
with.dt(test, expression(
  V1 = V1^2,
  V2 = V1*V2,
  V3 = V2/V3,
  V4 = sqrt(V3),
  new = letters[V4]
))

#     V1   V2  V3 V4 new
# 1:   1    1   1  1   a
# 2:   4    8   4  2   b
# 3:   9   27   9  3   c
# 4:  16   64  16  4   d
# 5:  25  125  25  5   e
# 6:  36  216  36  6   f
# 7:  49  343  49  7   g
# 8:  64  512  64  8   h
# 9:  81  729  81  9   i
#10: 100 1000 100 10   j


来源:https://stackoverflow.com/questions/41023553/using-multiple-times-in-data-table

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!