Question
I apologize if this is similar to a question I asked earlier, but I realized that my original question wasn't very clear.
I have a data frame with five columns and six rows (the real one has many more; I am simplifying here):
One Two Three Four Five
Cat NA NA NA NA
NA Dog NA NA NA
NA NA NA Mouse NA
Cat NA Rat NA NA
Horse NA NA NA NA
NA NA NA NA NA
Now, I would like to coalesce all the information into a single new column ('Summary'), like this:
Summary
Cat
Dog
Mouse
Error
Horse
NA
Please note the 'Error' in the fourth Summary row: two different values were found in that row during the merge. Also note that a row containing only NAs should give 'NA', not 'Error'. I looked at the 'coalesce' function in the dplyr package, but it doesn't seem to do what I need. Thanks in advance.
Answer 1:
One base R option could be:
ifelse(rowSums(!is.na(df)) > 1, "Error", do.call(pmin, c(df, na.rm = TRUE)))
[1] "Cat" "Dog" "Mouse" "Error" "Horse" NA
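To see why this one-liner works, it can help to split it into its three parts. The following is only a sketch: the data frame is rebuilt here from the question's table under the name df that the answer assumes, and the intermediate names n_values and first_value are introduced purely for illustration.

```r
# Example data from the question; `df` is the name the answer assumes.
df <- data.frame(
  One   = c("Cat", NA, NA, "Cat", "Horse", NA),
  Two   = c(NA, "Dog", NA, NA, NA, NA),
  Three = c(NA, NA, NA, "Rat", NA, NA),
  Four  = c(NA, NA, "Mouse", NA, NA, NA),
  Five  = rep(NA_character_, 6),
  stringsAsFactors = FALSE
)

# Step 1: count the non-NA cells in each row.
n_values <- rowSums(!is.na(df))

# Step 2: parallel minimum across the columns, ignoring NAs. When a row has
# at most one non-NA value, this simply returns that value (or NA if none).
first_value <- do.call(pmin, c(df, na.rm = TRUE))

# Step 3: rows holding more than one value are replaced by "Error".
Summary <- ifelse(n_values > 1, "Error", first_value)
print(Summary)
```

Note that pmin is only used as a convenient way to pick the single non-NA value; for rows with several values its result is discarded in step 3.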
Answer 2:
Reducing across the columns of df (starting with the first), compare the current column (old) to the next (new). For each element:
If old is NA, choose new.
If old is not NA, then choose old, unless new is also not NA, then 'Error':
Reduce(
function(old, new) ifelse(is.na(old), new, ifelse(!is.na(new), 'Error', old)),
df)
# [1] "Cat" "Dog" "Mouse" "Error" "Horse" NA
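The rule above can be traced column by column by asking Reduce to keep its intermediate results. This is a sketch only: the data frame is rebuilt from the question's table, and the names merge_step, steps and trace are hypothetical helpers, not part of the original answer.

```r
# Example data from the question, rebuilt so the block is self-contained.
df <- data.frame(
  One   = c("Cat", NA, NA, "Cat", "Horse", NA),
  Two   = c(NA, "Dog", NA, NA, NA, NA),
  Three = c(NA, NA, NA, "Rat", NA, NA),
  Four  = c(NA, NA, "Mouse", NA, NA, NA),
  Five  = rep(NA_character_, 6),
  stringsAsFactors = FALSE
)

# The same combining rule as in the answer, given a name for reuse.
merge_step <- function(old, new) {
  ifelse(is.na(old), new, ifelse(!is.na(new), "Error", old))
}

# accumulate = TRUE keeps the running result after each column is merged.
steps <- Reduce(merge_step, df, accumulate = TRUE)
names(steps) <- names(df)

# One column per step: column k is the summary after merging columns 1..k.
trace <- as.data.frame(steps, stringsAsFactors = FALSE)
print(trace)

# The last step is the final summary.
Summary <- trace[[ncol(trace)]]
```

In the trace, the fourth row stays "Cat" until the Three column brings in "Rat", at which point it becomes "Error" and remains so for the rest of the reduction.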
Answer 3:
We can use coalesce from dplyr. A column that contains only NA may be of type logical, and this can clash with the character columns when we use coalesce. One option is to change the class of that column, after which coalesce works:
library(dplyr)
df1 %>%
  mutate_if(~ all(is.na(.)) && is.logical(.), ~ NA_character_) %>%
  transmute(Summary = case_when(rowSums(!is.na(.)) > 1 ~ "Error",
                                TRUE ~ coalesce(!!! .)))
# Summary
#1 Cat
#2 Dog
#3 Mouse
#4 Error
#5 Horse
#6 <NA>
Data
df1 <- structure(list(One = c("Cat", NA, NA, "Cat", "Horse", NA), Two = c(NA,
"Dog", NA, NA, NA, NA), Three = c(NA, NA, NA, "Rat", NA, NA),
Four = c(NA, NA, "Mouse", NA, NA, NA), Five = c(NA, NA, NA,
NA, NA, NA)), class = "data.frame", row.names = c(NA, -6L
))
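All of the approaches above flag any row with more than one non-NA cell, even when those cells hold the same value. If only genuinely different values should count as an error, as the question's wording suggests, a base R sketch along these lines might work. The helper summarise_row and the seventh row (the same value in two columns) are hypothetical additions for illustration and do not appear in the original data.

```r
# Question's data plus one hypothetical extra row (row 7) in which the same
# value appears in two columns. Five is an all-NA logical column, as in Data.
df1 <- data.frame(
  One   = c("Cat", NA, NA, "Cat", "Horse", NA, "Cat"),
  Two   = c(NA, "Dog", NA, NA, NA, NA, NA),
  Three = c(NA, NA, NA, "Rat", NA, NA, "Cat"),
  Four  = c(NA, NA, "Mouse", NA, NA, NA, NA),
  Five  = NA,
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse one row to its single distinct value,
# NA if the row is empty, or "Error" if it holds conflicting values.
summarise_row <- function(values) {
  seen <- unique(values[!is.na(values)])
  if (length(seen) == 0) {
    NA_character_
  } else if (length(seen) == 1) {
    seen
  } else {
    "Error"
  }
}

# as.matrix() converts everything to character, so the logical all-NA
# column does not need special handling here.
Summary <- apply(as.matrix(df1), 1, summarise_row)
print(Summary)
```

Under this interpretation the extra row gives "Cat" rather than "Error", while the six original rows give the same result as before.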
Source: https://stackoverflow.com/questions/59741824/coalescing-many-columns-into-one-column