Joining tables based on different column names


I was watching a video[1] by Greg Reda about Pandas to see what Pandas can do and how it compares with data.table. I was surprised to learn how difficult it was to join tables in d

3 Answers

被撕碎了的回忆

2021-01-21 00:08
Normally, when joining data.tables the column names don't actually matter. You just need to make sure both tables have a compatible key.
```
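library(data.table)

# Two tables whose join columns have different names (a and x)
dt1 <- data.table(a = letters[1:10], b = 1:10)
setkey(dt1, a)
dt2 <- data.table(x = letters[1:10], y = 10:1)
setkey(dt2, x)

# Keyed join: the key of dt1 (a) is matched against the key of dt2 (x)
dt1[dt2]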
```
Basically, it joins on all the key columns, matched by position rather than by name. Normally you are joining on a key. If you really need to specify non-key columns, you can always cast the data.table to a data.frame and use the standard merge() function:
```
merge(as.data.frame(dt1),dt2, by.x="a", by.y="x")
merge(as.data.frame(dt1),dt2, by.x="b", by.y="y")
```
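To illustrate the point that a keyed join matches on every key column regardless of its name, here is a minimal sketch with two-column keys. The tables `sales` and `target` and all of their columns are hypothetical and are not part of the question.

```
library(data.table)

# Hypothetical tables, each keyed on two columns with different names
sales  <- data.table(region = c("N", "N", "S"),
                     yr     = c(2019L, 2020L, 2020L),
                     amt    = c(5, 7, 9))
target <- data.table(area = c("N", "S"),
                     year = c(2020L, 2020L),
                     goal = c(6, 10))
setkey(sales, region, yr)
setkey(target, area, year)

# Key columns are paired by position: region with area, yr with year
res <- sales[target]
print(res)
```

The result keeps the column names of `sales` for the join columns and appends the non-key columns of `target`.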
醉梦人生

2021-01-21 00:16

According to the Rdatatable GitHub page, if you want to compute on the join rather than just merge the tables, you can also write d1[d2, somefunc, on = "A==W"], where A is the column in d1 and W is the column in d2.
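A small sketch of that form, assuming a data.table version that accepts the string `on = "A==W"`. The tables, the extra columns `v` and `w`, and the expression used in place of `somefunc` are made up for illustration.

```
library(data.table)

# Hypothetical tables: column A in d1 corresponds to column W in d2
d1 <- data.table(A = c("a", "b", "c"), v = 1:3)
d2 <- data.table(W = c("a", "c"), w = c(10, 30))

# Join on A == W without setting keys, and compute in j.
# The i. prefix refers to columns of d2 (the table in the i position).
res <- d1[d2, .(A, total = v + i.w), on = "A==W"]
print(res)
```

Only the rows of `d1` that match a row of `d2` contribute, so `total` is computed for "a" and "c" only.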

独厮守ぢ

2021-01-21 00:18
Update: All the features listed below are implemented and available in the current stable version of data.table, v1.9.6, on CRAN.

At least these improvements are possible for joins in data.table:
- merge.data.table gaining by.x and by.y arguments
- Using secondary keys to join in both forms discussed above, without needing to set keys, by specifying the columns of x and i instead.
The simplest reason is that we've not managed to get to it yet.
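With v1.9.6 or later, both features can be sketched as follows. The small tables here reuse the column names from the first answer but are otherwise illustrative, and no keys are set on either table.

```
library(data.table)

dt1 <- data.table(a = letters[1:3], b = 1:3)
dt2 <- data.table(x = letters[1:3], y = 3:1)

# merge() dispatches to merge.data.table, which accepts by.x and by.y
m <- merge(dt1, dt2, by.x = "a", by.y = "x")

# Ad hoc join with on=: each name is a column of dt1, each value a column of dt2
j <- dt1[dt2, on = c(a = "x")]

# The same approach works for non-key, non-character columns such as b and y
k <- dt1[dt2, on = c(b = "y")]

print(m)
print(j)
print(k)
```

In the `on=` form, the join column keeps the name it has in `dt1`, and rows come back in the order of `dt2`.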