Surely this is not intended? Is this something that happens in other parts of dplyr's functionality, and should I be concerned? I love the performance and hat
From the dplyr documentation:
left_join() returns all rows from x, and all columns from x and y. Rows in x with no match in y will have NA values in the new columns. If there are multiple matches between x and y, all combinations of the matches are returned.
semi_join() returns all rows from x where there are matching values in y, keeping just columns from x. A semi join differs from an inner join because an inner join will return one row of x for each matching row of y, where a semi join will never duplicate rows of x.
Is semi_join() a valuable option for you?
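
To make the difference concrete, here is a minimal sketch on made-up data. The customers and orders tables and their columns are hypothetical, not taken from your question. It assumes the key is unique in x but repeated in y, which is the situation where left_join() returns more rows than x has.

```r
library(dplyr)

# Hypothetical data: customer 1 has two matching orders, customer 3 has none.
customers <- tibble(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
orders    <- tibble(id = c(1, 1, 2), item = c("pen", "ink", "pad"))

# left_join(): one row per combination of matches, so id 1 appears twice,
# and id 3 is kept with NA in the new column.
left_rows <- left_join(customers, orders, by = "id")

# semi_join(): keeps only the customers that have a match in orders,
# never duplicates rows, and adds no columns from orders.
semi_rows <- semi_join(customers, orders, by = "id")

nrow(left_rows)  # 4
nrow(semi_rows)  # 2
```

Note that semi_join() also drops the rows of x that have no match, so it only fits if you want to filter x rather than bring in columns from y.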