Question
I have two data.tables (dt1 and dt2). dt1 holds past product data and dt2 holds present product data. I want to create a third data.table that inserts rows from dt2 into dt1 only when the product characteristics (Level or Color) differ or the Product itself is new.
library(data.table)
dt1 <- fread('
Product Level Color ReviewDate
A 0 Blue 9/7/2016
B 1 Red 9/7/2016
C 1 Purple 9/7/2016
D 2 Blue 9/7/2016
E 1 Green 9/7/2016
F 4 Yellow 9/7/2016
')
dt2 <- fread('
Product Level Color ReviewDate
A 1 Black 9/8/2016
B 1 Red 9/8/2016
C 5 White 9/8/2016
D 2 Blue 9/8/2016
E 1 Green 9/8/2016
F 4 Yellow 9/8/2016
G 3 Orange 9/8/2016
')
My final data.table (dt3) should look like the table below. A and C differ between dt1 and dt2, so the new (different) rows from dt2 are inserted into the final table alongside all rows from dt1. G is a completely new product that was not in dt1, so it also makes it into the final table.
Product Level Color ReviewDate
A 0 Blue 9/7/2016
A 1 Black 9/8/2016
B 1 Red 9/7/2016
C 1 Purple 9/7/2016
C 5 White 9/8/2016
D 2 Blue 9/7/2016
E 1 Green 9/7/2016
F 4 Yellow 9/7/2016
G 3 Orange 9/8/2016
I have tried:
setkey(dt1, Product)
setkey(dt2, Product)
dt3<- dt1[dt2]
setkey(dt3,Product,ReviewDate)
Answer 1:
You can stack and uniqify:
unique(rbind(dt1, dt2), by=c("Product", "Level", "Color"))
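As a rough sketch of why this works: unique() keeps the first row of each (Product, Level, Color) group, and because dt1 is stacked first, unchanged products keep their dt1 ReviewDate. The result comes back in stacking order rather than sorted by Product, so the sort step below is an assumed extra, added only to match the desired output. The data is rebuilt with data.table() instead of fread() just to keep the example self-contained.

```r
library(data.table)

# Same data as in the question, rebuilt inline
dt1 <- data.table(
  Product    = c("A", "B", "C", "D", "E", "F"),
  Level      = c(0L, 1L, 1L, 2L, 1L, 4L),
  Color      = c("Blue", "Red", "Purple", "Blue", "Green", "Yellow"),
  ReviewDate = "9/7/2016"
)
dt2 <- data.table(
  Product    = c("A", "B", "C", "D", "E", "F", "G"),
  Level      = c(1L, 1L, 5L, 2L, 1L, 4L, 3L),
  Color      = c("Black", "Red", "White", "Blue", "Green", "Yellow", "Orange"),
  ReviewDate = "9/8/2016"
)

# Stack dt1 on top of dt2, then keep the first row per characteristic combination
dt3 <- unique(rbind(dt1, dt2), by = c("Product", "Level", "Color"))

# ReviewDate is a character column here, so parse it before sorting by date
dt3 <- dt3[order(Product, as.Date(ReviewDate, format = "%m/%d/%Y"))]
print(dt3)
```

Parsing the date inside order() leaves the ReviewDate column itself unchanged; only the row order is affected.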
Answer 2:
Another alternative is to rbind only the subset of dt2 that differs from dt1, which avoids creating one big data.table containing both dt1 and dt2:
dt3 <- rbind(dt1, setDT(dt2)[!dt1, on=c("Product", "Level", "Color")])
dt3[order(Product, ReviewDate),]
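To make the anti-join step easier to see in isolation, here is a minimal sketch on a reduced, made-up subset of the question's data (three products only). The names keys and new_rows are introduced purely for illustration. The expression dt2[!dt1, on = keys] returns the rows of dt2 that have no match in dt1 on the listed columns.

```r
library(data.table)

# Reduced illustrative data (a subset of the question's tables)
dt1 <- data.table(
  Product    = c("A", "B"),
  Level      = c(0L, 1L),
  Color      = c("Blue", "Red"),
  ReviewDate = "9/7/2016"
)
dt2 <- data.table(
  Product    = c("A", "B", "G"),
  Level      = c(1L, 1L, 3L),
  Color      = c("Black", "Red", "Orange"),
  ReviewDate = "9/8/2016"
)

keys <- c("Product", "Level", "Color")

# Anti-join: rows of dt2 whose (Product, Level, Color) does not appear in dt1
new_rows <- dt2[!dt1, on = keys]

# Append only those rows, then sort in place
# (ReviewDate is character; a plain string sort is enough for these two dates)
dt3 <- rbind(dt1, new_rows)
setorder(dt3, Product, ReviewDate)
print(dt3)
```

Note that setorder() reorders dt3 by reference, whereas dt3[order(Product, ReviewDate), ] returns a sorted copy that has to be assigned if the sorted order should be kept.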
Answer 3:
Using merge...
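library(dplyr)  # select() is assumed to come from dplyr
d <- merge(dt1, dt2, by = c("Product", "Level", "Color"), all = TRUE)
# Prefer the dt1 date; fall back to the dt2 date for rows that exist only in dt2
d$ReviewDate <- ifelse(is.na(d$ReviewDate.x), d$ReviewDate.y, d$ReviewDate.x)
as.data.frame(select(d, 1, 2, 3, 6))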
Product Level Color ReviewDate
1 A 0 Blue 9/7/2016
2 A 1 Black 9/8/2016
3 B 1 Red 9/7/2016
4 C 1 Purple 9/7/2016
5 C 5 White 9/8/2016
6 D 2 Blue 9/7/2016
7 E 1 Green 9/7/2016
8 F 4 Yellow 9/7/2016
9 G 3 Orange 9/8/2016
Source: https://stackoverflow.com/questions/39398135/updating-data-table-by-inserting-new-rows-that-are-different-from-old-rows