问题
I frequently find myself doing a long series of chained calculations using :=
on the same data table. For example, something like this
test = data.table(1:10, 1:10, 1:10, 1:10)
test[, V1 := V1^2]
test[, V2 := V1*V2]
test[, V3 := V2/V3]
test[, V4 := sqrt(V3)]
test[, new := letters[V4]]
Having to write test[, ...]
on every line 1) takes longer to type (not a big issue I can cope with that). But, more importantly, also distracts visually from the flow and content of the calculation. I would much rather write something like
test[, {
V1 := V1^2
V2 := V1*V2
V3 := V2/V3
V4 := sqrt(V3)
new := letters[V4]
}]
But this throws an error of You have wrapped := with {} which is ok but then := must be the only thing inside {}
.
I know that I can write
within(test, {
V1 = V1^2
V2 = V1*V2
V3 = V2/V3
V4 = sqrt(V3)
new = letters[V4]
})
But this loses the efficiency of using :=
I tried writing a function to provide this ability
with.dt = function(dt, expressions){
e = strsplit(expressions,'\n')
for (i in 1:length(e)){
dt[, eval(parse(text = e[i]))]
}
dt
}
with.dt(test, "
V1 := V1^2;
V2 := V1*V2;
V3 := V2/V3;
V4 := sqrt(V3);
new := letters[V4];
")
But this does not change the content of the data.table
Is there any syntax that permits the neat look of the within
version, while retaining the assign by reference property of :=
?
回答1:
There are a couple different ways to do this.
Original test matrix used here:
v1 v2
1: 1 3
2: 2 4
First, we can do something like this:
test[,`:=`(v1 = v1^2,
v2 = v1*v2)
Output:
v1 v2
1 3
4 8
Or, if we want it done sequentially we can use Frank's hack.
test[, `:=`(v1 = v1 <- v1^2, v2 = v2 * v1)]
Output:
v1 v2
1 3
4 16
Or, similarly we could run this:
test[,c("v1","v2") := list(v1^2,v1^2*v2)]
Output:
v1 v2
1 3
4 16
回答2:
We can use a function that takes a list of expressions:
with.dt = function(dt, expr){
for (j in 1:length(expr)) set(dt, , names(expr)[j], dt[, eval(expr[[j]])])
}
test = data.table(1:10, 1:10, 1:10, 1:10)
with.dt(test, expression(
V1 = V1^2,
V2 = V1*V2,
V3 = V2/V3,
V4 = sqrt(V3),
new = letters[V4]
))
# V1 V2 V3 V4 new
# 1: 1 1 1 1 a
# 2: 4 8 4 2 b
# 3: 9 27 9 3 c
# 4: 16 64 16 4 d
# 5: 25 125 25 5 e
# 6: 36 216 36 6 f
# 7: 49 343 49 7 g
# 8: 64 512 64 8 h
# 9: 81 729 81 9 i
#10: 100 1000 100 10 j
来源:https://stackoverflow.com/questions/41023553/using-multiple-times-in-data-table