data.table join (Error in vecseq): is a key necessary on both X and i?

暗喜 2021-02-09 03:04

I am new to R and to data.table, which I find useful and fast. I am trying to join 2 data tables:

> TotFreq
        Legacy_Store_Number WeekDay           


        
1 Answer
  • 2021-02-09 03:44

    Update Oct 2014: Arun fixed this in v1.9.5:

    allow.cartesian is now ignored when i has no duplicates, #742 and #508. Thanks to @nigmastar, @user3645882 and others for the reports.



    Previous answer ...

    First, let's address the allow.cartesian part. The error message should probably be changed to point out that the result can be large even when i has no duplicates, as long as the left-hand-side data.table has duplicate key values. Here's a simple example:

    library(data.table)

    dt1 = data.table(a = c(1,1), b = 1:2, key = 'a')
    dt2 = data.table(a = c(1,2), c = 3:4)
    
    dt1[dt2] # this gives an error in older versions (before the v1.9.5 fix), because the join results in 3 rows, as seen below
    
    dt1[dt2, allow.cartesian = TRUE]
    #   a  b c
    #1: 1  1 3
    #2: 1  2 3
    #3: 2 NA 4
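
    To see where the 3 rows come from, one can count the matches by hand before joining. The following is only a sketch under the default nomatch = NA behaviour; the names counts, matches and expected_rows are made up for illustration and are not part of data.table:

```r
library(data.table)

dt1 <- data.table(a = c(1, 1), b = 1:2, key = "a")
dt2 <- data.table(a = c(1, 2), c = 3:4)

# Number of rows per key value in the left-hand-side table.
counts <- dt1[, .N, by = a]

# For each row of dt2, how many rows of dt1 share its key value.
# A row of dt2 with no match still yields one NA-filled row (assumes nomatch = NA).
matches <- counts$N[match(dt2$a, counts$a)]
matches[is.na(matches)] <- 1L

# Total number of rows the join dt1[dt2] would produce.
expected_rows <- sum(matches)
print(expected_rows)
```

    Here a = 1 matches two rows of dt1 and a = 2 matches none, so the total is 2 + 1 rows, even though dt2 itself has no duplicates.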
    

    Now, as for setting the key: no, you don't need to set a key on i. If i has no key, its first few columns are matched, in order, to the key columns of X. Looking at your first join result, one can see that it was not joined on ItemType and that you're using an older data.table version (I'm using 1.9.3). So my guess is that either you didn't set the key correctly and left out ItemType, or there was a bug in older versions that has since been fixed.
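
    As a small illustration of joining against an unkeyed i, here is a sketch with made-up data. The tables x and i and the columns store, day, sales, s, d and target are hypothetical and only loosely echo the columns in the question:

```r
library(data.table)

# X is keyed on two columns.
x <- data.table(store = c(1L, 1L, 2L),
                day   = c("Mon", "Tue", "Mon"),
                sales = c(10, 20, 30))
setkey(x, store, day)

# i has no key; its first two columns are matched, in order,
# against the key columns of x (store, then day).
i <- data.table(s = c(1L, 2L),
                d = c("Tue", "Mon"),
                target = c(15, 25))

res <- x[i]
print(haskey(i))
print(res)
```

    If the key of x were set on store alone, only the first column of i would take part in the join, which is one way a column such as ItemType can end up left out.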
