I have two datasets
a
= raw dataset with thousands of observations of different weather events
STATE EVTYPE
1 AL WINTER STORM
2 AL TORNADO
3 AL TSTM WIND
4 AL TSTM WIND
5 AL TSTM WIND
6 AL HAIL
7 AL HIGH WIND
8 AL TSTM WIND
9 AL TSTM WIND
10 AL TSTM WIND
b
= a dictionary table, which has a standard spelling for some weather events.
EVTYPE evmatch
1 HIGH SURF ADVISORY <NA>
2 COASTAL FLOOD COASTAL FLOOD
3 FLASH FLOOD FLASH FLOOD
4 LIGHTNING LIGHTNING
5 TSTM WIND <NA>
6 TSTM WIND (G45) <NA>
both are merged into df_new
by evtype
library(dplyr)
df_new <- left_join(a, b, by = c("EVTYPE"))
STATE EVTYPE evmatch
1 AL WINTER STORM WINTER STORM
2 AL TORNADO NA
3 AL TSTM WIND THUNDERSTORM WIND
4 AL TSTM WIND THUNDERSTORM WIND
5 AL TSTM WIND THUNDERSTORM WIND
6 AL HAIL NA
7 AL HIGH WIND HIGH WIND
8 AL TSTM WIND THUNDERSTORM WIND
9 AL TSTM WIND THUNDERSTORM WIND
10 AL TSTM WIND THUNDERSTORM WIND
11 AL HEAVY RAIN NA
12 AL FLASH FLOOD NA
13 AL TSTM WIND THUNDERSTORM WIND
14 AL HEAVY RAIN NA
15 AL TSTM WIND THUNDERSTORM WIND
Fill in the missing NAs
As you can see in the df_new$evmatch
, there are a NAs. How can I merge the dataset, but have all NA's in evmatch
filled in by the corresponding word from EVTYPE
. For example...
Wanted output
STATE EVTYPE evmatch
1 AL WINTER STORM WINTER STORM
2 AL TORNADO TORNADO
3 AL TSTM WIND THUNDERSTORM WIND
4 AL TSTM WIND THUNDERSTORM WIND
5 AL TSTM WIND THUNDERSTORM WIND
6 AL HAIL HAIL
7 AL HIGH WIND HIGH WIND
8 AL TSTM WIND THUNDERSTORM WIND
9 AL TSTM WIND THUNDERSTORM WIND
10 AL TSTM WIND THUNDERSTORM WIND
11 AL HEAVY RAIN HEAVY RAIN
12 AL FLASH FLOOD FLASH FLOOD
13 AL TSTM WIND THUNDERSTORM WIND
14 AL HEAVY RAIN HEAVY RAIN
15 AL TSTM WIND THUNDERSTORM WIND
user3200293
Answers as given in the comments to the question:
1: using base R
Method 1:
df_new$evmatch <- with(df_new, ifelse(is.na(evmatch), EVTYPE, evmatch))
Method 2:
df_new$evmatch[is.na(df_new$evmatch] <- df_new$EVTYPE[is.na(df_new$evmatch]
Note: Make sure that both vars are characters or erroneous results will occur. If needed transform with as.character
.
2: using data.table
library(data.table)
setDT(df_new)[is.na(evmatch), evmatch := EVTYPE]
3: using dplyr
library(dplyr)
filter(df_new, is.na(evmatch) %>%
select(evmatch) <- filter(df_new, is.na(evmatch) %>%
select(EVTYPE)
来源:https://stackoverflow.com/questions/34979396/partially-merge-two-datasets-and-fill-in-nas-in-r