Cartesian Product using data.table package

南旧 2021-02-05 09:51

Using the data.table package in R, I am trying to create a Cartesian product of two data.tables using the merge method, as one would do in base R.

In base the following w

3 answers

离开以前 (OP)

2021-02-05 10:10
The solution from @JoshO'Brien uses merge; below is a similar alternative that, as far as I know, does not.

If I understand the documentation in ?data.table::merge correctly, X[Y] should be slightly faster than data.table::merge(X,Y) (as of version 1.8.7). The documentation refers to FAQ 2.12 on this point, but the FAQ is a little confusing. First, the correct reference should be 1.12, not 2.12. Second, it does not say whether it means the base version of merge, the data.table one, or both. So this might just end up being a messier-looking solution that is equivalent, or it might be faster.

[Edit from Matthew] Thanks: now improved in v1.8.7 (?merge.data.table, FAQ 1.12, and a new FAQ 2.24 added)
```
library(data.table)

# Orders table, keyed on the customer name columns
DT_orders<-data.table(date=as.POSIXct(c('2012-08-28','2012-08-29','2012-08-29','2012-09-01')),
                      first.name=c('John','John','George','Henry'),
                      last.name=c('Doe','Doe','Smith','Smith'),
                      qty=c(10,2,50,6),
                      key=c("first.name","last.name"))

# Note that I added a second record to the orders table for John Doe, to make sure it could handle duplicate first/last name combinations.

DT_dates<-data.table(date=seq(from=as.POSIXct('2012-08-28'),
                              to=as.POSIXct('2012-09-07'),by='day'),
                     key="date")

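# Join on a constant dummy column k so that every date pairs with every customer.
# Assumption: in recent data.table versions unique() needs by= to deduplicate on
# the name columns only, and a join this large needs allow.cartesian=TRUE.
DT_custdates<-data.table(k=1,unique(DT_dates),key="k")[
  unique(DT_orders,by=c("first.name","last.name"))[,list(k=1,first.name,last.name)],
  allow.cartesian=TRUE][,k:=NULL]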
```
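For comparison, here is a minimal sketch that puts the two approaches discussed above side by side: merge on a constant dummy column, and the X[Y] join on the same column. The toy tables and the names `dates`, `custs`, `dates_k`, `custs_k`, `via_merge` and `via_join` are made up for illustration and are not from the original post. It assumes a reasonably recent data.table, where `on=` is available and `allow.cartesian=TRUE` is needed once the result is larger than the two inputs combined.

```r
library(data.table)

# Hypothetical toy tables (not the data from the answer above)
dates <- data.table(date = as.Date("2012-08-28") + 0:2)
custs <- data.table(first.name = c("John", "George"),
                    last.name  = c("Doe", "Smith"))

# Give both tables the same constant dummy column to join on
dates_k <- copy(dates)[, k := 1L]
custs_k <- copy(custs)[, k := 1L]

# Variant 1: merge() on the dummy column, then drop it
via_merge <- merge(dates_k, custs_k, by = "k", allow.cartesian = TRUE)[, k := NULL]

# Variant 2: X[Y] join on the dummy column, then drop it
via_join <- dates_k[custs_k, on = "k", allow.cartesian = TRUE][, k := NULL]

# Each result should hold every date paired with every customer
print(nrow(via_merge) == nrow(dates) * nrow(custs))
print(nrow(via_join)  == nrow(dates) * nrow(custs))
```

Whether one variant is noticeably faster than the other probably depends on the data.table version and table sizes, so it is worth timing both on real data rather than relying on this sketch.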