I am new to R and to data.table, which I find useful and fast. I am trying to join two data tables:
> TotFreq
Legacy_Store_Number WeekDay
Update Oct 2014: Arun fixed it in v1.9.5: allow.cartesian is now ignored when i has no duplicates, #742 and #508. Thanks to @nigmastar, @user3645882 and others for the reports.
Previous answer ...
First, let's address the allow.cartesian part. The error message should probably be changed to point out that you can get a large result even when i has no duplicates, provided the left-hand-side data.table has duplicated key values. Here's a simple example:
library(data.table)
dt1 = data.table(a = c(1,1), b = 1:2, key = 'a')   # key column a has a duplicated value
dt2 = data.table(a = c(1,2), c = 3:4)              # no duplicates in a
dt1[dt2]  # errors in versions before 1.9.5, because the join results in 3 rows, as seen below
dt1[dt2, allow.cartesian = TRUE]
#    a  b c
# 1: 1  1 3
# 2: 1  2 3
# 3: 2 NA 4
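To see where the three rows come from, here is a small sketch that reuses dt1 and dt2 from the example above and checks each side for duplicated join values. The variable name res is only illustrative.

```r
library(data.table)

dt1 <- data.table(a = c(1, 1), b = 1:2, key = "a")
dt2 <- data.table(a = c(1, 2), c = 3:4)

# i (dt2) has no duplicated values in the join column
anyDuplicated(dt2$a)   # 0

# the keyed table (dt1) does: a = 1 appears twice
anyDuplicated(dt1$a)   # 2, the index of the first duplicate

# a = 1 in dt2 matches both rows of dt1, and a = 2 matches none (one NA row),
# so 2 rows in i produce 3 rows in the result
res <- dt1[dt2, allow.cartesian = TRUE]
nrow(res)              # 3
```

The row count grows purely because of the duplicates on the left-hand side, which is the case the error message does not mention.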
Now, as far as setting the key goes: no, you don't need to set a key for i; data.table will just assume its first few columns are the key columns, matched in order against the key of the left-hand-side table. Looking at your first join result, one can see that it was not joined on ItemType and that you're using an older data.table version (I'm using 1.9.3). So my guess is that either you didn't actually set the key correctly and left out ItemType, or there was a bug in older versions that has since been fixed.
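As a rough illustration of both points, the sketch below sets a three-column key on the left-hand table, confirms it with key(), and joins against an unkeyed i. The tables x and i and all their values are invented for the example; only the column names Legacy_Store_Number, WeekDay and ItemType come from the question, and Freq and Price are hypothetical.

```r
library(data.table)

# Hypothetical left-hand table, keyed on all three join columns
x <- data.table(Legacy_Store_Number = c(1L, 1L, 2L),
                WeekDay  = c("Mon", "Tue", "Mon"),
                ItemType = c("A", "B", "A"),
                Freq     = c(10L, 20L, 30L))
setkey(x, Legacy_Store_Number, WeekDay, ItemType)

# Check that the key really includes ItemType before joining
key(x)   # "Legacy_Store_Number" "WeekDay" "ItemType"

# Hypothetical i with no key: its first three columns are matched,
# in order, against the three key columns of x
i <- data.table(Legacy_Store_Number = c(1L, 2L),
                WeekDay  = c("Tue", "Mon"),
                ItemType = c("B", "A"),
                Price    = c(1.5, 2.5))
key(i)   # NULL

res <- x[i]
res$Freq # 20 30
```

Because the match is positional, the column order in i matters when it has no key. In data.table 1.9.6 and later, the on= argument can name the join columns explicitly instead, for example x[i, on = c("Legacy_Store_Number", "WeekDay", "ItemType")].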