I have two tables in different dimensions, now I want to replace value datA$swl1 with values in datB$swl2 according to userids.
datA
id swl1
1
You can use merge
to match by id
, then replace in column swl1
those items from datB
which exist:
datC <- merge(datA, datB, all.x=TRUE)
datC
## id swl1 swl2
## 1 1 0.8 0.8
## 2 2 0.7 NA
## 3 3 0.4 0.6
## 4 4 0.7 NA
## 5 5 0.0 0.7
This matches up the rows. Now to replace those values in column swl1
with the non-NA
values from column swl2
:
datC$swl1 <- ifelse(is.na(datC$swl2), datC$swl1, datC$swl2)
datC$swl2 <- NULL
datC
## id swl1
## 1 1 0.8
## 2 2 0.7
## 3 3 0.6
## 4 4 0.7
## 5 5 0.7
You can obtain this result with one line of code:
datA$swl1[datA$id %in% datB$id] <- datB$swl2
#> datA
# id swl1
#1 1 0.8
#2 2 0.7
#3 3 0.6
#4 4 0.7
#5 5 0.7
With the %in%
operator we select the entries of the column datA$swl1
that belong to rows with the same id
as those listed in datB
. These values in the column of datA$swl1
are then replaced with the entries of the swl2
column of datB
.
IIUC, using data.table v1.9.5:
require(data.table)
setDT(datA)[datB, swl1 := swl2, on = "id"]
datA
is updated by reference.
If you'd like to select the largest value, regardless of which column it is in, you could try
library(dplyr)
datA <- data.frame(id=c(1,2,3,4,5), swl1=c(0.8, 0.7, 0.4, 0.7, 0.0))
datB <- data.frame(id=c(1,3,5), somename=c(0.8, 0.6, 0.7))
datC <- full_join(datA, datB)
datA <- data.frame(id=c(1:5))
datA$swli1 <- apply(datC[, c('swl1', 'somename')], 1, function(x) max(na.omit(x)))
> datA
id swli1
1 1 0.8
2 2 0.7
3 3 0.6
4 4 0.7
5 5 0.7