Complex non-equi merge in R

一向 2020-12-01 09:44

I'm trying to do a complex non-equi join between two tables. I was inspired by a presentation at the last useR2016 (https://channel9.msdn.com/events/useR-international-R-

2 answers
  • 2020-12-01 10:27

    For "between" joins like this one, you could also use data.table::foverlaps instead of a non-equi join. It joins two data.tables on overlapping ranges.

    Taking the same example, the following code would produce the desired outcome.

    # foverlaps tests whether two ranges overlap. Create a second column,
    # dbh2, as the end point of the (zero-width) range [dbh, dbh2].
    dt1[, dbh2 := dbh]
    
    # foverlaps requires its second argument (y) to be keyed
    setkey(dt1, sp, dbh, dbh2)
    
    # find rows where dbh falls between dbh_min and dbh_max, and drop unnecessary
    # columns afterwards
    foverlaps(dt2, dt1, by.x = c("sp", "dbh_min", "dbh_max"), by.y = key(dt1),
              nomatch = 0)[
      ,
      -c("dbh2", "dbh_min", "dbh_max")
    ]
    
    #  sp dbh gr_sp dhb_clas
    #  1: SAB  10   RES        s
    #  2: SAB  12   RES        s
    #  3: SAB  16   RES        m
    #  4: SAB  22   RES        l
    #  5: EPN  12   RES        s
    #  6: EPN  16   RES        m
    #  7: BOP  10   DEC        s
    #  8: BOP  12   DEC        s
    #  9: BOP  14   DEC        s
    # 10: BOP  20   DEC        m
    # 11: BOP  26   DEC        l
    # 12: PET  12   DEC        s
    # 13: PET  16   DEC        s
    # 14: PET  18   DEC        s
    
  • 2020-12-01 10:32

    So I was very close. I had two problems. First, a bad installation of the data.table package caused an obscure error (see: Data table error could not find function ".").

    After fixing that, I got closer and found that:

    dt1[dt2, on=.(sp=sp, dbh>=dbh_min, dbh<=dbh_max), nomatch=0]
    

    gave me the rows I wanted, but with a bad dbh column. Inverting the join with:

    dt2[dt1, on=.(sp=sp, dbh_min<=dbh, dbh_max>=dbh)]
    

    fixed the problem, leaving only one useless extra column.
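
The "bad dbh column" in the first form likely comes from how data.table reports non-equi join columns: in the result, the column named after dt1's join column holds the bound taken from dt2. One possible way to keep the original value is the `x.` prefix in `j`. The sketch below is only an illustration: the sample tables are hypothetical (the original data is not shown in this thread), and only the column names are taken from the output above.

```r
library(data.table)

# Hypothetical sample data; the real tables are not shown in the thread.
dt1 <- data.table(sp  = c("SAB", "SAB", "SAB", "BOP", "BOP"),
                  dbh = c(10, 16, 22, 12, 26))
dt2 <- data.table(sp       = rep(c("SAB", "BOP"), each = 3),
                  dbh_min  = c(0, 15, 20, 0, 15, 25),
                  dbh_max  = c(15, 20, 30, 15, 25, 50),
                  gr_sp    = rep(c("RES", "DEC"), each = 3),
                  dhb_clas = rep(c("s", "m", "l"), times = 2))

# x.dbh refers to dbh as stored in dt1 (the "x" table), not to the
# dbh_min/dbh_max bounds that the non-equi join reports under that name.
res <- dt1[dt2,
           on = .(sp, dbh >= dbh_min, dbh <= dbh_max),
           nomatch = 0,
           .(sp, dbh = x.dbh, gr_sp, dhb_clas)]
print(res)
```

Selecting the columns explicitly in `j` also avoids the extra column left by the inverted join, at the cost of listing every column you want to keep.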
