I'm trying to combine two CSV files into one file. They share a common id but are different sizes. I used merge(), but I got duplicated rows. I have the following data frames:
One option is regex_left_join() from the fuzzyjoin package:
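library(fuzzyjoin)
library(dplyr)
# Match data2$SR against data1$SR and data2$class against data1$school,
# treating the data1 columns as regular expressions
regex_left_join(data2, data1, by = c("SR", "class" = "school")) %>%
  # Both inputs have an SR column, so the join suffixes them; keep the data2 copy
  select(SR = SR.x, school, class, Y)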
# SR school class Y
# 1 SR1 S-1 S-1.2 3
# 2 SR1 S-1 S-1.5 3
# 3 SR1 S-1 S-1.7 3
# 4 SR2 S-1 S-1.1 4
# 5 SR2 S-1 S-1.2 4
# 6 SR2 S-1 S-1.3 4
# 7 SR2 S-1 S-1.6 4
# 8 SR2 S-2 S-2.3 1
# 9 SR2 S-2 S-2.9 1
# 10 SR2 S-4 S-4.2 2
# 11 SR3 S-2 S-2.1 5
# 12 SR3 S-2 S-2.3 5
# 13 SR4 S-1 S-1.5 2
# 14 SR4 S-1 S-1.6 2
# 15 SR4 S-5 S-5.1 3
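One caveat worth checking, sketched below: as far as I know, regex_left_join() tests each value with stringr::str_detect(), so an unanchored pattern such as "S-1" matches anywhere in the string. The ids "S-11.2" and the anchored pattern are hypothetical, made up only to show the effect; they are not from the data above.

```r
library(stringr)

ids <- c("S-1.2", "S-11.2")  # hypothetical class ids

# Unanchored: "S-1" is found inside both strings
loose <- str_detect(ids, "S-1")

# Anchored at the start and followed by a literal dot: only "S-1.2" matches
strict <- str_detect(ids, "^S-1\\.")

print(loose)
print(strict)
```

If your real ids can collide like this, you could build an anchored pattern column in data1 and join on that instead of the raw school value.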
Could you edit your question and use dput() to put your two data frames into a form that would be easier for us to grab?
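For instance, a minimal sketch of what that looks like. The data frame here is a hypothetical stand-in with invented values, not your actual data:

```r
# Hypothetical stand-in for one of your data frames
data1 <- data.frame(
  SR = c("SR1", "SR2"),
  school = c("S-1", "S-2"),
  Y = c(3, 1)
)

# Prints R code that recreates the object; paste that output into the question
dput(data1)

# For a large data frame, a sample is usually enough
dput(head(data1, 20))
```

Anyone can then assign the pasted output to a name and have an identical copy of your data.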
Having said that, you need to do something like this:
# NOT RUN
library(tidyverse)
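RESULT <- data2 %>%
  # str_extract() returns the matched text; str_detect() would only give TRUE/FALSE,
  # which cannot serve as a join key. Adapt the pattern to your id format.
  mutate(comparison.id = str_extract(outcome.id, "^.+\\d+")) %>%
  inner_join(data1, by = c("SR.id", "comparison.id"))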
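To make the idea concrete, here is a runnable sketch on toy data. Everything in it is an assumption for illustration: the values are invented, the column names simply follow the snippet above, and the pattern "^[^.]+" assumes the key is whatever precedes the first dot. It uses str_extract() because a join key needs the matched text rather than TRUE/FALSE.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; values are invented for illustration only
data1 <- tibble(
  SR.id = c("SR1", "SR2", "SR2"),
  comparison.id = c("S-1", "S-1", "S-2"),
  Y = c(3, 4, 1)
)
data2 <- tibble(
  SR.id = c("SR1", "SR1", "SR2", "SR2"),
  outcome.id = c("S-1.2", "S-1.5", "S-1.1", "S-2.3")
)

RESULT <- data2 %>%
  # Keep everything before the first dot, e.g. "S-1.2" becomes "S-1"
  mutate(comparison.id = str_extract(outcome.id, "^[^.]+")) %>%
  # Exact join on both keys, so each data2 row picks up its single Y value
  inner_join(data1, by = c("SR.id", "comparison.id"))

print(RESULT)
```

Because the join is exact on both keys, each row of data2 matches at most one row of data1 here, which avoids the duplicated rows that a merge on the id alone produces.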