I'm new to R programming and I'm stuck on the example below.
Basically I have two data sets:
dataset1:
ID Category
1 CatZ
Another option is to use the data.table package.
Using the same setup as @tmfmnk in his answer:
Construct the sample data set:
df1 <- read.table(text = "ID Category
1 CatZZ
2 CatVV
3 CatAA
4 CatQQ", header = TRUE, stringsAsFactors = FALSE)
df2 <- read.table(text = "ID Category
1 Cat600
3 Cat611", header = TRUE, stringsAsFactors = FALSE)
Load the data.table package and convert the data frames to data tables:
library(data.table)
df1 <- data.table(df1)
df2 <- data.table(df2)
Perform a left join (keep every row of df1, attach the category from df2 wherever the ID matches, then create a new column a that uses the df2 category when there is one and falls back to the df1 category otherwise):
a <- df2[df1, on = "ID"][, a := ifelse(is.na(Category), i.Category, Category)]
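To see what that chained call operates on, it can help to inspect the intermediate join result first. The following is only a sketch, assuming the same sample data as above (built directly with data.table() here so it runs on its own); the name joined is mine, for illustration. It also uses fcoalesce(), available in data.table 1.12.4 and later, as a possible substitute for the ifelse() call:

```r
library(data.table)

# Same sample data as above, built directly as data tables
df1 <- data.table(ID = 1:4, Category = c("CatZZ", "CatVV", "CatAA", "CatQQ"))
df2 <- data.table(ID = c(1L, 3L), Category = c("Cat600", "Cat611"))

# Intermediate result of the join: one row per row of df1.
# Category comes from df2 (NA where the ID has no match),
# i.Category is the original value from df1.
joined <- df2[df1, on = "ID"]
print(names(joined))

# fcoalesce() returns the first non-missing value element-wise,
# so it picks the df2 category when present and the df1 one otherwise.
joined[, a := fcoalesce(Category, i.Category)]
print(joined)
```

The i. prefix is how data.table disambiguates a column of the table passed inside the brackets (df1 here) when it shares a name with a column of the outer table (df2).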
There is a nice question and answer on data.table joins here: Left join using data.table
Also, to get exactly the result you asked for, you can do:
a <- df2[df1, on = "ID"][, list(ID, Category = ifelse(is.na(Category), i.Category, Category))]
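If you do not need a separate result table, an update join may also do the job. This is a sketch of an alternative, not the approach used above, and it again assumes the same sample data. It overwrites Category in df1 by reference for the IDs that appear in df2:

```r
library(data.table)

# Same sample data as above, built directly as data tables
df1 <- data.table(ID = 1:4, Category = c("CatZZ", "CatVV", "CatAA", "CatQQ"))
df2 <- data.table(ID = c(1L, 3L), Category = c("Cat600", "Cat611"))

# Update join: for rows of df1 whose ID matches a row in df2,
# replace Category with the df2 value (i.Category).
# Rows without a match are left untouched.
df1[df2, on = "ID", Category := i.Category]
print(df1)
```

Note that := modifies df1 in place, so work on copy(df1) if you still need the original values.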