R data table: update join

后端 未结 2 1491
伪装坚强ぢ
伪装坚强ぢ 2021-01-03 08:24

Suppose I have two data tables:

X <- data.table(id = 1:5, L = letters[1:5])

   id L
1:  1 a
2:  2 b
3:  3 c
4:  4 d
5:  5 e

Y <- data.table(id = 3:5,         


        
相关标签:
2条回答
  • 2021-01-03 08:42

    I may have miss few things, please correct me if there is a better solution. I usually like to write function for such things.

    Here one: the goal is to have all possibilities available. Join and update as well the join variable, use other variable names ...

    > update.DT <- function(DATA1, DATA2, join.variable, overwrite.variable, overwrite.with.variable) {
    +       
    +       DATA1[DATA2, c(overwrite.variable) := mget(p0("i.", overwrite.with.variable)), on = join.variable][]
    +       
    +     }
    > X <- X2 <- X3 <- data.table(id = 1:5, L = letters[1:5], PS = rep(59, 5))
    > Y <- data.table(id = 3:5, id2 = 11:13, L = c("z", "g", "h"), PS = rep(61, 3))
    > X
       id L PS
    1:  1 a 59
    2:  2 b 59
    3:  3 c 59
    4:  4 d 59
    5:  5 e 59
    > Y
       id id2 L PS
    1:  3  11 z 61
    2:  4  12 g 61
    3:  5  13 h 61
    > update.DT(DATA1 = X, DATA2 = Y, join.variable = "id", overwrite.variable = c("L"), overwrite.with.variable = c("L"))
       id L PS
    1:  1 a 59
    2:  2 b 59
    3:  3 z 59
    4:  4 g 59
    5:  5 h 59
    > update.DT(DATA1 = X2, DATA2 = Y, join.variable = "id", overwrite.variable = c("L", "PS"), overwrite.with.variable = c("L", "PS"))
       id L PS
    1:  1 a 59
    2:  2 b 59
    3:  3 z 61
    4:  4 g 61
    5:  5 h 61
    > update.DT(DATA1 = X2, DATA2 = Y, join.variable = "id", overwrite.variable = c("L", "PS", "id"), overwrite.with.variable = c("L", "PS", "id2"))
       id L PS
    1:  1 a 59
    2:  2 b 59
    3: 11 z 61
    4: 12 g 61
    5: 13 h 61
    
    0 讨论(0)
  • 2021-01-03 08:59

    It is an update join:

    library(data.table)
    X <- data.table(id = 1:5, L = letters[1:5])
    Y <- data.table(id = 3:5, L = c(NA, "g", "h"), N = c(10, NA, 12))
    X[Y, on=.(id), c("L", "N"):=.(i.L, i.N)][]
    #    id  L  N
    # 1:  1  a NA
    # 2:  2  b NA
    # 3:  3 NA 10
    # 4:  4  g NA
    # 5:  5  h 12
    

    gives you the desired result.
    Here I found a solution for multiple columns:

    library(data.table)
    X <- data.table(id = 1:5, L = letters[1:5])
    Y <- data.table(id = 3:5, L = c(NA, "g", "h"), N = c(10, NA, 12))
    
    X[Y, on=.(id), names(Y)[-1]:=mget(paste0("i.", names(Y)[-1]))]
    

    Another variant:

    n <- names(Y)
    X[Y, on=.(id), (n):=mget(paste0("i.", n))]
    
    0 讨论(0)
提交回复
热议问题