Still getting to grips with this great package... Could anyone please explain the reason for this error? Thanks!
library(data.table)
DT <- data.table(id = LETTER
First, it is recommended to use := instead of [<- for efficiency. The [<- method is mostly provided for backward compatibility. So, I'll first illustrate how to use := efficiently to get what you're after. := is assignment by reference: it updates a data.table without copying the data, and is therefore extremely fast.
require(data.table)
DT <- data.table(x = 1:5, y = 6:10, z = 11:15)
Suppose you want to set the 2nd row of "y" to the value in the 5th row of "y":
DT[2, y := DT[5, y]]
or equivalently
DT[2, `:=`(y = DT[5, y])]
Suppose you want to set the 2nd row of both "y" and "z" to the corresponding entries in row 5:
DT[2, c("y", "z") := as.list(DT[5, c(y, z)])]
or equivalently
DT[2, `:=`(y = DT[5, y], z = DT[5, z])]
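If the column names are held in a variable, the same update can be sketched as follows. This is only an illustration and not part of the original question: cols is a hypothetical variable name, and wrapping it in parentheses on the left of := tells data.table to use the names stored in it rather than create a column called "cols". Selecting with with = FALSE returns a one-row data.table (which is a list), so each column keeps its own type instead of being combined by c().

```r
library(data.table)

DT <- data.table(x = 1:5, y = 6:10, z = 11:15)

# 'cols' is a hypothetical helper variable holding the target column names
cols <- c("y", "z")

# (cols) := ... uses the names stored in cols; the RHS is a one-row
# data.table, i.e. a list with one element per column
DT[2, (cols) := DT[5, cols, with = FALSE]]

print(DT[2])
```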
Now, just to show you how to assign using [<- (although it is clearly not recommended), it can be done as follows:
DT <- data.table(x = 1:5, y = 6:10, z = 11:15)
DT[1, c("y", "z")] <- as.list(DT[5, c(y, z)])
or equivalently, you can also pass the column number:
DT[1, 2:3] <- as.list(DT[5, c(y, z)])
Hope this helps.
First, the RHS has to be a list for [<-.data.table if more than one column is being assigned.
Second, the j argument on the left of <- is not evaluated within the environment of your data.table, so it needs to know what the values for j are. Since you provide var1 and var2 (without the double quotes that would make them a character vector), they are treated as variables. It therefore looks for variables named var1 and var2, but because it doesn't "see" the columns of your data.table as variables (as it normally does when you do assignments etc. on the RHS of <-), it looks for them in the parent environment, which is the global environment. It doesn't find them there, so you get the error. For example, do this:
y <- "y"
z <- "z"
# And now try your second case:
DT[2, c(y, z)] <- as.list(DT[5, c(y, z)])
# the left side takes values from the assignments you made above
# the right side y and z are evaluated within the environment of your data.table
# and so it sees the columns y and z as variables and their values are picked accordingly
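To see the scoping difference in isolation, here is a small sketch. It assumes a fresh table whose column names (col_a and col_b, both made up for this example) do not exist as variables anywhere else. The unquoted form should fail because j is looked up outside the data.table, while the quoted form works.

```r
library(data.table)

# col_a / col_b are hypothetical column names chosen so that no variables
# with the same names exist in the calling environment
DT2 <- data.table(col_a = 1:3, col_b = 4:6)

# Unquoted names in j on the left of <- are looked up as ordinary variables,
# so this is expected to raise an error, which we capture instead of stopping
res <- tryCatch({
  DT2[1, c(col_a, col_b)] <- list(0L, 0L)
  "no error"
}, error = function(e) conditionMessage(e))
print(res)

# Quoted names form a character vector, which j accepts
DT2[1, c("col_a", "col_b")] <- list(0L, 0L)
print(DT2)
```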
Third, the [<-.data.table function accepts only atomic (vector) types for the j argument. So, your first assignment DT[2, list(var1, var2)] <- DT[8, list(var1, var2)] will still give an error even if you do it the right way, that is:
y <- "y"
z <- "z"
DT[2, list(y, z)] <- as.list(DT[5, c(y, z)])
# Error in `[<-.data.table`(`*tmp*`, 2, list(y, z), value = list(10L, 15L)) :
# j must be atomic vector, see ?is.atomic
Hope this helps.
Note that the data.table is copied when assigning with [<- but not when using :=:
DT <- data.table(x = 1:5, y = 6:10, z = 11:15)
tracemem(DT)
# [1] "<0x7fbefb89b580>"
DT[1, c("y", "z") := list(100L, 110L)]
tracemem(DT)
# [1] "<0x7fbefb89b580>"
DT[2, c("y", "z")] <- list(200L, 201L)
# tracemem[0x7fbefacc4fa0 -> 0x7fbefd297838]: # copied, inefficient
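One practical consequence of := working by reference: a plain <- of a data.table does not create an independent copy, so a later := shows up under both names. A minimal sketch, where alias and snap are just illustrative names and copy() is the data.table function for taking a real copy:

```r
library(data.table)

DT <- data.table(x = 1:5, y = 6:10, z = 11:15)

alias <- DT        # second name for the same table, not a copy
snap  <- copy(DT)  # independent copy

DT[1, y := 100L]   # update by reference

print(alias[1, y]) # reflects the update made through DT
print(snap[1, y])  # still the original value
```

So if you need the original values later, take a copy() before updating with :=.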