I have a dataframe with some NA values:
dfa <- data.frame(a=c(1,NA,3,4,5,NA),b=c(1,5,NA,NA,8,9),c=c(7,NA,NA,NA,2,NA))
dfa
I would like t
In the tidyverse, you can use purrr::map2_df, which is a strictly bivariate version of mapply that simplifies to a data.frame, and dplyr::coalesce, which replaces NA values in its first argument with the corresponding ones in the second.
library(tidyverse)
dfrepair %>%
  mutate_all(as.numeric) %>%    # coalesce is strict about types
  map2_df(dfa, ., coalesce)
## # A tibble: 6 × 3
## a b c
## <dbl> <dbl> <dbl>
## 1 1 1 7
## 2 3 5 7
## 3 3 4 6
## 4 4 3 5
## 5 5 8 2
## 6 7 9 3
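To make the behaviour of coalesce concrete, here is a minimal sketch on two plain vectors. The names x, y and filled are made up for illustration and do not come from the question.

```r
library(dplyr)

x <- c(1, NA, 3, NA)      # vector with gaps
y <- c(10, 20, 30, 40)    # replacement values, same length and type

# Positions where x is NA are taken from y; every other position is left alone.
filled <- coalesce(x, y)
filled
## [1]  1 20  3 40
```

map2_df in the answer above simply applies this same call to each pair of columns in turn.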
We can use Map from base R to do a columnwise comparison between the two datasets:
dfa[] <- Map(function(x,y) {x[is.na(x)] <- y[is.na(x)]; x}, dfa, dfrepair)
dfa
# a b c
#1 1 1 7
#2 3 5 7
#3 3 4 6
#4 4 3 5
#5 5 8 2
#6 7 9 3
dfa <- data.frame(a=c(1,NA,3,4,5,NA),b=c(1,5,NA,NA,8,9),c=c(7,NA,NA,NA,2,NA))
dfa
dfrepair <- data.frame(a=c(2:7),b=c(6:1),c=c(8:3))
dfrepair
library(dplyr)
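# as.numeric() cannot be applied to a whole data frame, so coalesce column by column
as.data.frame(Map(function(x, y) coalesce(as.numeric(x), as.numeric(y)), dfa, dfrepair))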
a b c
1 1 1 7
2 3 5 7
3 3 4 6
4 4 3 5
5 5 8 2
6 7 9 3
Because much of dplyr is implemented in compiled code (C++), it is faster in most cases. Another important advantage is that coalesce, like many other dplyr functions, behaves the same way as its SQL counterpart. By using dplyr you learn SQL while coding in R. ;-)
You can do:
dfa <- data.frame(a=c(1,NA,3,4,5,NA),b=c(1,5,NA,NA,8,9),c=c(7,NA,NA,NA,2,NA))
dfrepair <- data.frame(a=c(2:7),b=c(6:1),c=c(8:3))
dfa[is.na(dfa)] <- dfrepair[is.na(dfa)]
dfa
a b c
1 1 1 7
2 3 5 7
3 3 4 6
4 4 3 5
5 5 8 2
6 7 9 3
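The one-liner above works because is.na(dfa) returns a logical matrix with the same shape as the data frame, and both data frames are indexed by that same matrix. Below is a minimal sketch that spells this out and adds a sanity check. The names na_mask and filled are introduced only for illustration, and the approach assumes both data frames have the same dimensions and column order.

```r
dfa <- data.frame(a = c(1, NA, 3, 4, 5, NA),
                  b = c(1, 5, NA, NA, 8, 9),
                  c = c(7, NA, NA, NA, 2, NA))
dfrepair <- data.frame(a = 2:7, b = 6:1, c = 8:3)

# Logical matrix, same shape as dfa; TRUE marks a missing cell.
na_mask <- is.na(dfa)

# Work on a copy so the original stays available for comparison.
filled <- dfa
filled[na_mask] <- dfrepair[na_mask]

# Sanity checks: nothing is missing any more, and observed values are unchanged.
stopifnot(!anyNA(filled))
stopifnot(all(filled[!na_mask] == dfa[!na_mask]))
```

Computing the mask once up front also makes it obvious that the same cells are selected on both sides of the assignment.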