I have two data frames (df1, df2). I want to fill in the AGE and SEX values in df2 from df1 wherever the ID matches between the two. I tried several approaches using a for loop.
You could use match() with lapply() for this. If we iterate [[ over a vector of column names, matching on the ID column of the two original data sets each time, we get the desired result.
nm <- c("AGE", "SEX")
df2[nm] <- lapply(nm, function(x) df1[[x]][match(df2$ID, df1$ID)])
df2
# ID AGE SEX Conc
# 1 90901 39 0 5.00
# 2 90901 39 0 10.00
# 3 90901 39 0 15.00
# 4 90903 40 1 30.00
# 5 90903 40 1 5.00
# 6 90902 28 0 2.45
# 7 90902 28 0 51.00
# 8 90902 28 0 1.00
# 9 70905 NA NA 0.50
Note that this is also quite a bit faster than merge().
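To see why the match() approach works, here is a minimal self-contained sketch. The two data frames are hypothetical: they are reconstructed from the output shown above, since the original df1 and df2 are not included here, so treat the values as assumptions. match() returns, for each ID in df2, the row position of that ID in df1 (or NA when there is none), and that index vector is then used to pull the values out of df1.

```r
# Hypothetical data, reconstructed from the printed result (an assumption)
df1 <- data.frame(ID  = c(90901, 90902, 90903),
                  AGE = c(39, 28, 40),
                  SEX = c(0, 0, 1))
df2 <- data.frame(ID   = c(90901, 90901, 90901, 90903, 90903,
                           90902, 90902, 90902, 70905),
                  AGE  = NA,
                  SEX  = NA,
                  Conc = c(5, 10, 15, 30, 5, 2.45, 51, 1, 0.5))

# For each df2$ID, the row of df1 holding that ID; NA if the ID is absent
idx <- match(df2$ID, df1$ID)
idx
# [1]  1  1  1  3  3  2  2  2 NA

# Index each df1 column by idx and assign the results into df2
nm <- c("AGE", "SEX")
df2[nm] <- lapply(nm, function(x) df1[[x]][idx])
```

Computing idx once, outside lapply(), is only a small tidy-up of the one-liner above; the result should be the same.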
Try merge(df1, df2, by = "ID"). This will merge your two data frames on the ID column. If your example is a good representation of your actual data, you will probably want to drop the AGE and SEX columns from df2 before you merge; otherwise merge() keeps both copies, suffixed .x and .y.
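# Drop the AGE and SEX columns from df2 so they do not clash with df1's
df2$AGE <- NULL
df2$SEX <- NULL
# Inner join on ID: keeps only rows whose ID appears in both data frames
df3 <- merge(df1, df2, by = "ID")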
If you need to keep rows from df2 even when there is no matching ID in df1, then you do this:
df2 <- subset(df2, select = -c(AGE, SEX))
df3 <- merge(df1, df2, by = "ID", all.y = TRUE)
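As a rough illustration of the difference between the two merge() calls, here is a small sketch on hypothetical data (reconstructed from the output in the question, so the values are assumptions, and df2 is built without AGE and SEX so nothing needs dropping first). The default merge is an inner join and loses the unmatched ID, while all.y = TRUE keeps it with NA in the columns that come from df1.

```r
# Hypothetical data (an assumption, not the asker's actual data frames)
df1 <- data.frame(ID  = c(90901, 90902, 90903),
                  AGE = c(39, 28, 40),
                  SEX = c(0, 0, 1))
df2 <- data.frame(ID   = c(90901, 90901, 90901, 90903, 90903,
                           90902, 90902, 90902, 70905),
                  Conc = c(5, 10, 15, 30, 5, 2.45, 51, 1, 0.5))

# Inner join: ID 70905 has no match in df1, so its row is dropped
inner <- merge(df1, df2, by = "ID")
nrow(inner)
# [1] 8

# Keep every row of df2 (the "y" argument); unmatched rows get NA
df3 <- merge(df1, df2, by = "ID", all.y = TRUE)
nrow(df3)
# [1] 9
```

Note that merge() sorts the result by the key column by default, so the rows of df3 will not be in the original df2 order.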
You can learn more about merge (or any R function) by typing ?merge in your R console.