R Left Outer Join with 0 Fill Instead of NA While Preserving Valid NA's in Left Table

礼貌的吻别 2021-02-18 22:53

What is the easiest way to do a left outer join on two data tables (dt1, dt2) with the fill value being 0 (or some other value) instead of NA (the default), without overwriting valid NA values in the left table?

3 Answers
  •  时光说笑
    2021-02-18 23:42

    The cleanest way at present may simply be to seed an intermediate table with the join-key values from the left table (dt1), join dt2 onto it, set the resulting NA values to 0, and then join the intermediate table back onto dt1. This can be done entirely with data.table and doesn't depend on data.frame syntax, and the intermediate step ensures that the second join has no unmatched rows and so introduces no new NA values:

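    library(data.table)
    dt1 <- data.table(x = c('a', 'b', 'c', 'd', 'e'), y = c(NA, 'w', NA, 'y', 'z'))
    dt2 <- data.table(x = c('a', 'b', 'c'), new_col = c(1, 2, 3))
    setkey(dt1, x)
    setkey(dt2, x)
    # Join dt2 onto just the key column of dt1; unmatched keys get NA in new_col
    inter_table <- dt2[dt1[, list(x)]]
    # In this example every NA here comes from a missing match, so fill them all
    inter_table[is.na(inter_table)] <- 0
    setkey(inter_table, x)
    # Every key in dt1 now has a match, so this join adds no new NAs
    merged <- inter_table[dt1]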
    
    > merged
       x new_col  y
    1: a       1 NA
    2: b       2  w
    3: c       3 NA
    4: d       0  y
    5: e       0  z
    

    The benefit of this approach is that it doesn't depend on the new columns being added on the right, and it stays within data.table's keyed-join speed optimizations. I'm crediting the answer to @SamFirke because his solution also works and may be more useful in other contexts.
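
    For comparison, here is a minimal sketch of a different route (not the method described in this answer): do a single join with on = "x", then fill NA only in the columns that came from dt2, so NAs already present in dt1 are never touched. The name fill_cols is introduced here purely for illustration, and the sketch assumes that dt2's non-key column names do not also appear in dt1 and that dt2 itself holds no NAs you want to keep.

```r
library(data.table)

dt1 <- data.table(x = c("a", "b", "c", "d", "e"), y = c(NA, "w", NA, "y", "z"))
dt2 <- data.table(x = c("a", "b", "c"), new_col = c(1, 2, 3))

# Left join: for every row of dt1, look up the matching row of dt2 on x.
# Rows of dt1 without a match (d, e) get NA in new_col.
merged <- dt2[dt1, on = "x"]

# Assumption: every non-key column of dt2 should be filled with 0 where NA.
fill_cols <- setdiff(names(dt2), "x")
for (col in fill_cols) {
  # set() updates by reference; only rows that are NA in this column change.
  set(merged, i = which(is.na(merged[[col]])), j = col, value = 0)
}

print(merged)
```

    With the sample tables above, merged$new_col comes out as 1, 2, 3, 0, 0, while the NAs in y for rows a and c stay NA.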
