Joining the data tables:
library(data.table)
X <- data.table(A = 1:4, B = c(1, 1, 1, 1))
#    A B
# 1: 1 1
# 2: 2 1
# 3: 3 1
# 4: 4 1
Y <- data.table(A = 4)
#    A
# 1: 4
You're partially correct. The missing piece of the puzzle is that (currently) when you perform any join, including a non-equi join with <, a single column is returned for the join column (A in your example). This column takes its values from the data.table on the right side of the join, in this case the values in A from Y.
Here's an illustrated example:
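As a minimal sketch of the point above, using the X and Y from the question (res is a name introduced here for illustration, and the result reflects the behaviour described in this answer, which may differ in later data.table versions):

```r
library(data.table)

X <- data.table(A = 1:4, B = c(1, 1, 1, 1))
Y <- data.table(A = 4)

# Non-equi join: match the rows of X whose A is less than Y's A (4).
res <- X[Y, on = .(A < A)]

# Three rows of X match (A = 1, 2, 3), but the single A column in the
# result holds the value from Y rather than the matching values from X.
res$A  # 4 for every matched row
res$B  # 1 for every matched row
```

The row count shows which rows of X matched, while the A column only shows what they were compared against.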
We're planning to change this behaviour in a future version of data.table so that both columns will be returned in the case of non-equi joins. See pull requests https://github.com/Rdatatable/data.table/pull/2706 and https://github.com/Rdatatable/data.table/pull/3093.
When doing a non-equi join like X[Y, on = .(A < A)], data.table returns the A-column from Y (the i-data.table).
To get the desired result, you could do:
X[Y, on = .(A < A), .(A = x.A, B)]
which gives:
   A B
1: 1 1
2: 2 1
3: 3 1
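To see both join columns side by side, the x. and i. prefixes can be combined in j. This is a minimal sketch assuming X and Y as defined in the question; the names res_both, x_A and i_A are introduced here for illustration only:

```r
library(data.table)

X <- data.table(A = 1:4, B = c(1, 1, 1, 1))
Y <- data.table(A = 4)

# x.A refers to A from X (the outer table), i.A to A from Y (the i table).
# Naming the columns explicitly keeps the two apart in the result.
res_both <- X[Y, on = .(A < A), .(x_A = x.A, i_A = i.A, B)]

res_both$x_A  # the values of A in X that satisfy A < 4
res_both$i_A  # the value of A in Y, repeated once per match
```

Selecting the columns explicitly in j avoids depending on which table the default join column is taken from.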
In the next release, data.table will return both A columns. See here for the discussion.