问题
I'm working with a dataset that has many columns called status1, status2, etc. Within those columns, it says if someone is exempt, complete, registered, etc.
Unfortunately, the exempt inputs are not consistent; here's a sample:
library(dplyr)
problem <- tibble(person = c("Corey", "Sibley", "Justin", "Ruth"),
status1 = c("7EXEMPT", "Completed", "Completed", "Pending"),
status2 = c("exempt", "Completed", "Completed", "Pending"),
status3 = c("EXEMPTED", "Completed", "Completed", "ExempT - 14"))
I'm trying to use case_when() to make a new column that has their final status. If it ever says completed, then they are completed. If it ever says exempt without saying complete, then they are exempt.
The important part is that I want my code to use contains("status"), or some equivalent that only targets the status columns and doesn't require typing them all, and I want it to only require a partial string match for exempt.
As for using contains with case_when, I saw this example, but I wasn't able to apply it to my case: mutate with case_when and contains
This is what I've tried to use so far, but as you can guess, it has not worked:
library(purrr)
library(dplyr)
library(stringr)
solution <- problem %>%
mutate(final= case_when(pmap_chr(select(., contains("status")), ~
any(c(...) == str_detect(., "Exempt") ~ "Exclude",
TRUE ~ "Complete"
))))
Here's what I want the final product to look like:
solution <- tibble(person = c("Corey", "Sibley", "Justin", "Ruth"),
status1 = c("7EXEMPT", "Completed", "Completed", "Pending"),
status2 = c("exempt", "Completed", "Completed", "Pending"),
status3 = c("EXEMPTED", "Completed", "Completed", "ExempT - 14"),
final = c("Exclude", "Completed", "Completed", "Exclude"))
Thank you!
回答1:
I think you are doing it backwards. Put case_when
inside pmap_chr
instead of the other way around:
library(dplyr)
library(purrr)
library(stringr)
problem %>%
mutate(final = pmap_chr(select(., contains("status")),
~ case_when(any(str_detect(c(...), "(?i)Exempt")) ~ "Exclude",
TRUE ~ "Completed")))
For each pmap
iteration (each row of problem
dataset), we want to use case_when
to check if there exists the string Exempt
. (?i)
in str_detect
makes it case insensitive. This is the same as writing str_detect(c(...), regex("Exempt", ignore_case = TRUE))
Output:
# A tibble: 4 x 5
person status1 status2 status3 final
<chr> <chr> <chr> <chr> <chr>
1 Corey 7EXEMPT exempt EXEMPTED Exclude
2 Sibley Completed Completed Completed Completed
3 Justin Completed Completed Completed Completed
4 Ruth Pending Pending ExempT - 14 Exclude
来源:https://stackoverflow.com/questions/56588108/case-when-with-partial-string-match-and-contains