My problem is very similar to the one posted here.
The difference is that they knew which columns would conflict, whereas I need a generic method that won't k
A data.table solution:
library(data.table)
dt1 <- data.table(read.table(header=TRUE, text="Date Time ColumnA ColumnB
01/01/2013 08:00 10 30
01/01/2013 08:30 15 25
01/01/2013 09:00 20 20
02/01/2013 08:00 25 15
02/01/2013 08:30 30 10
02/01/2013 09:00 35 5"))
dt2 <- data.table(read.table(header=TRUE, text="Date ColumnA ColumnB ColumnC
01/01/2013 100 300 1
02/01/2013 200 400 2"))
setkey(dt1, "Date")
setkey(dt2, "Date")
# Note: the ColumnC assignment has to come before the summing operations,
# otherwise it throws an error (see below)
dt1[dt2, `:=`(ColumnC = i.ColumnC, ColumnA = ColumnA + i.ColumnA,
ColumnB = ColumnB + i.ColumnB)]
# Date Time ColumnA ColumnB ColumnC
# 1: 01/01/2013 08:00 110 330 1
# 2: 01/01/2013 08:30 115 325 1
# 3: 01/01/2013 09:00 120 320 1
# 4: 02/01/2013 08:00 225 415 2
# 5: 02/01/2013 08:30 230 410 2
# 6: 02/01/2013 09:00 235 405 2
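The update above names ColumnA, ColumnB and ColumnC explicitly. For the generic case raised in the question, one possible approach is to work out the shared and new columns from the column names and update them in a loop with `set()`. This is only a minimal sketch, not a tested general solution: it assumes `Date` is unique in `dt2` and that every shared non-key column is numeric. The names `key_col`, `shared`, `new_cols` and `idx` are made up for illustration, and the tables are rebuilt from the data above so the snippet runs on its own.

```r
library(data.table)

# Same data as above, rebuilt so the snippet is self-contained
dt1 <- data.table(
  Date    = rep(c("01/01/2013", "02/01/2013"), each = 3),
  Time    = rep(c("08:00", "08:30", "09:00"), times = 2),
  ColumnA = c(10, 15, 20, 25, 30, 35),
  ColumnB = c(30, 25, 20, 15, 10, 5)
)
dt2 <- data.table(
  Date    = c("01/01/2013", "02/01/2013"),
  ColumnA = c(100, 200),
  ColumnB = c(300, 400),
  ColumnC = c(1, 2)
)

key_col  <- "Date"                                               # assumed join column
shared   <- setdiff(intersect(names(dt1), names(dt2)), key_col)  # columns present in both
new_cols <- setdiff(names(dt2), names(dt1))                      # columns only in dt2

# For each row of dt1, the position of the matching row in dt2 (NA if none)
idx <- match(dt1[[key_col]], dt2[[key_col]])

# Sum the shared columns, then copy over the new ones, all by reference
for (col in shared)   set(dt1, j = col, value = dt1[[col]] + dt2[[col]][idx])
for (col in new_cols) set(dt1, j = col, value = dt2[[col]][idx])
```

With the sample data this gives the same ColumnA, ColumnB and ColumnC values as the hard-coded update. Rows of `dt1` with no matching `Date` in `dt2` would end up with `NA` in the updated columns, so that case needs separate handling if it can occur.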
I'm not sure why placing the ColumnC assignment last in the `:=` call throws the error below. Perhaps MatthewDowle could explain the cause.
dt1[dt2, `:=`(ColumnA = ColumnA + i.ColumnA, ColumnB = ColumnB + i.ColumnB,
ColumnC = i.ColumnC)]
Error in `[.data.table`(dt1, dt2, `:=`(ColumnA = ColumnA + i.ColumnA, :
Value of SET_STRING_ELT() must be a 'CHARSXP' not a 'NULL'
Update from v1.8.9:
o Mixing adding new with updating existing columns into one `:=`() by group; i.e.,
DT[, `:=`(existingCol=..., newCol=...), by=...]
now works without error or segfault, #2778 and #2528. Many thanks to Arun for reporting both with reproducible examples. Tests added.
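As a small illustration of the pattern that NEWS item describes, the sketch below updates an existing column and adds a new one in a single grouped `:=` call. The table `DT` and its columns `g`, `x` and `total` are hypothetical, and the snippet assumes a data.table version that includes the fix.

```r
library(data.table)

# Hypothetical toy table: g is the grouping column, x already exists
DT <- data.table(g = c("a", "a", "b"), x = c(1L, 2L, 5L))

# Update the existing column x and add the new column total in one call.
# Both right-hand sides are evaluated on the original x before assignment.
DT[, `:=`(x = x * 2L, total = sum(x)), by = g]
```

Here `x` is doubled within each group, and `total` holds the per-group sum of the original `x` values.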