Cartesian Product using data.table package

南旧 2021-02-05 09:51

Using the data.table package in R, I am trying to create a Cartesian product of two data.tables using the merge method, as one would in base R.

In base the following w

3 Answers
  •  粉色の甜心
    2021-02-05 10:09

    merge.data.table(x, y) is a convenience function that wraps a call to x[y], so the merge has to be based on columns that are present in both data.tables. (That's what the error message is trying to tell you.)

    One work-around is to add a dummy column to both data.tables, whose only purpose is to make the merge possible:

    ## Add a column "k", and append it to each data.table's vector of keyed columns.
    setkeyv(cust.dt[,k:=1], c(key(cust.dt), "k"))
    setkeyv(dates.dt[,k:=1], c(key(dates.dt), "k"))
    
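    ## Merge and then remove the dummy column.
    ## allow.cartesian=TRUE permits a result larger than nrow(x) + nrow(y),
    ## which newer data.table versions otherwise reject.
    res <- merge(dates.dt, cust.dt, by="k", allow.cartesian=TRUE)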
    head(res[,k:=NULL])
    #          date first.name last.name
    # 1: 2012-08-28     George     Smith
    # 2: 2012-08-28      Henry     Smith
    # 3: 2012-08-28       John       Doe
    # 4: 2012-08-29     George     Smith
    # 5: 2012-08-29      Henry     Smith
    # 6: 2012-08-29       John       Doe
    
    ## Maybe also clean up cust.dt and dates.dt    
    # cust.dt[,k:=NULL]
    # dates.dt[,k:=NULL]
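
    For anyone who wants to try the dummy-column idea end to end, here is a minimal self-contained sketch. The contents of cust.dt and dates.dt are hypothetical stand-ins, since the question's original tables are not shown, and x and y are just illustrative names. It works on copies so the originals never receive the k column, and it passes allow.cartesian=TRUE on the assumption that a recent data.table version is installed, where a join whose result exceeds nrow(x) + nrow(y) is otherwise refused.

```r
library(data.table)

# Hypothetical stand-ins for the question's tables (the originals are not shown)
cust.dt  <- data.table(first.name = c("George", "Henry", "John"),
                       last.name  = c("Smith", "Smith", "Doe"))
dates.dt <- data.table(date = as.Date(c("2012-08-28", "2012-08-29")))

# Work on copies so the dummy column k is never added to the originals
x <- copy(dates.dt)[, k := 1L]
y <- copy(cust.dt)[, k := 1L]

# Every row has k == 1, so the join pairs each date with each customer.
# allow.cartesian = TRUE lets the result exceed nrow(x) + nrow(y) rows.
res <- merge(x, y, by = "k", allow.cartesian = TRUE)

# Drop the dummy column from the result
res[, k := NULL]

print(res)
```

    Copying costs a little memory, but it avoids the clean-up step at the end, because := would otherwise modify cust.dt and dates.dt by reference.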
    
