Question
I'm writing a Shiny app where the user will be inputting data for conditions of their samples, and the script will "automatically" match their inputted conditions to sample names of a given file.
For simplicity, I will not include the shiny code, because I am only struggling with the actual R implementation.
If I already know what the potential conditions are, I could do something like:
library(tidyverse)
x <- data.frame(Samples = c('Low1', 'Low2', 'High1', 'High2',
                            'Ctrl1', 'Ctrl2'))
x <- x %>% mutate(Conditions = case_when(
  str_detect(Samples, fixed("low", ignore_case = TRUE)) ~ "low",
  str_detect(Samples, fixed("high", ignore_case = TRUE)) ~ "high",
  str_detect(Samples, fixed("ctrl", ignore_case = TRUE)) ~ "ctrl"))
And I would get what I am looking for, a data frame like:
Samples Conditions
Low1 low
Low2 low
High1 high
High2 high
Ctrl1 ctrl
Ctrl2 ctrl
However, I want to loop through a vector of potential conditions and do something like:
library(tidyverse)
condition_options <- c('low', 'high', 'ctrl')
samplenames <- c('Low1', 'Low2', 'High1', 'High2', 'Ctrl1', 'Ctrl2')
x <- data.frame(Samples = samplenames)
for (j in condition_options) {
  x <- x %>% mutate(Condition = case_when(
    str_detect(Samples, fixed(j, ignore_case = TRUE)) ~ j))
}
When I do this, the Condition column is overwritten on each iteration, so I only get matches for the last value in the vector. For example:
Samples Condition
Low1 <NA>
Low2 <NA>
High1 <NA>
High2 <NA>
Ctrl1 ctrl
Ctrl2 ctrl
Answer 1:
This might be easier if you build all parts of your case_when statement with meta-programming rather than using a loop. Try:
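library(tidyverse)
condition_options <- c('low', 'high', 'ctrl')
# Build one two-sided formula per condition, unquoting the current value with !!
conditions <- purrr::map(condition_options,
  ~ quo(str_detect(Samples, fixed(!!.x, ignore_case = TRUE)) ~ !!.x))
samplenames <- c('Low1', 'Low2', 'High1', 'High2', 'Ctrl1', 'Ctrl2')
x <- data.frame(Samples = samplenames)
# Splice the list of formulas into case_when with !!!
x %>% mutate(Condition = case_when(!!!conditions))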
# Samples Condition
# 1 Low1 low
# 2 Low2 low
# 3 High1 high
# 4 High2 high
# 5 Ctrl1 ctrl
# 6 Ctrl2 ctrl
Here, map builds all the formulas you would otherwise write by hand in the case_when statement. We then use !!! to splice them into the case_when call inside the mutate expression.
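If you would rather keep the for loop from the question, one option is to make each pass leave already-matched rows alone. The following is only a minimal sketch, assuming the first matching condition should win. The up-front NA initialisation and the `!is.na(Condition) ~ Condition` branch are my additions and are not part of the answer above.

```r
library(dplyr)
library(stringr)

condition_options <- c("low", "high", "ctrl")
x <- data.frame(Samples = c("Low1", "Low2", "High1", "High2", "Ctrl1", "Ctrl2"))

# Start with an all-NA character column so every pass has something to preserve.
x$Condition <- NA_character_

for (j in condition_options) {
  x <- x %>% mutate(Condition = case_when(
    # Keep whatever an earlier iteration already filled in.
    !is.na(Condition) ~ Condition,
    # Otherwise label the row if it matches the current condition.
    str_detect(Samples, fixed(j, ignore_case = TRUE)) ~ j
  ))
}

x
```

Rows that match none of the conditions stay NA, because case_when returns NA when no branch applies.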
Answer 2:
library(purrr)
library(stringr)
x <- data.frame(Samples = c('Low1', 'Low2', 'High1', 'High2',
                            'Ctrl1', 'Ctrl2'))
condition_options <- c('low', 'high', 'ctrl')
# Iterate over `condition_options`; for each one, return the condition name where it matches and NA otherwise.
matched_values <- map(condition_options, function(condition_name) {
  ifelse(
    str_detect(x$Samples, fixed(condition_name, ignore_case = TRUE)),
    condition_name,
    NA_character_
  )
})
# For each sample, return NA if nothing matched, otherwise the matched value. pmap_chr throws an error if more than one condition matches.
x["Conditions"] <- pmap_chr(matched_values, function(...) {
  values <- unlist(list(...))
  if (all(is.na(values))) {
    return(NA_character_)
  } else {
    return(values[!is.na(values)])
  }
})
> x
Samples Conditions
1 Low1 low
2 Low2 low
3 High1 high
4 High2 high
5 Ctrl1 ctrl
6 Ctrl2 ctrl
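If an error on multiple matches is not what you want, a possible variation is to collapse the per-condition vectors with dplyr::coalesce, which keeps the first non-NA value for each sample. This is a sketch under that assumption, not the method of the answer above. The sample "Blank1" is a hypothetical name added only to show the unmatched case.

```r
library(purrr)
library(stringr)
library(dplyr)

# "Blank1" is a made-up sample that matches none of the conditions.
x <- data.frame(Samples = c("Low1", "Low2", "High1", "High2",
                            "Ctrl1", "Ctrl2", "Blank1"))
condition_options <- c("low", "high", "ctrl")

# One character vector per condition: the condition name where it matches, NA elsewhere.
matched_values <- map(condition_options, function(condition_name) {
  ifelse(
    str_detect(x$Samples, fixed(condition_name, ignore_case = TRUE)),
    condition_name,
    NA_character_
  )
})

# coalesce() takes the first non-NA value across the vectors, position by position.
x$Conditions <- coalesce(!!!matched_values)

x
```

With this version, the order of condition_options decides which label a sample gets when more than one condition matches.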
Answer 3:
I don't think you need a loop to do this. We can use str_extract to extract any value that matches the pattern built from condition_options:
x$Conditions <- stringr::str_extract(tolower(x$Samples),
paste0(condition_options, collapse = "|"))
x
# Samples Conditions
#1 Low1 low
#2 Low2 low
#3 High1 high
#4 High2 high
#5 Ctrl1 ctrl
#6 Ctrl2 ctrl
In base R, we can also generate the regex dynamically using paste0:
x$Conditions <- sub(paste0(".*(", paste0(condition_options, collapse = "|"), ").*"),
"\\1", tolower(x$Samples))
where
paste0(".*(", paste0(condition_options, collapse = "|"), ").*") #gives
#[1] ".*(low|high|ctrl).*"
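One difference between the two approaches is worth checking before using them on user-supplied names: str_extract returns NA when nothing matches, whereas sub returns its input unchanged. The sketch below illustrates this with a hypothetical unmatched sample, "Blank1", and shows one possible way (my assumption, using grepl and ifelse) to get NA from the base R version as well.

```r
library(stringr)

condition_options <- c("low", "high", "ctrl")
# "Blank1" is a made-up sample that matches none of the conditions.
samples <- c("Low1", "High2", "Ctrl1", "Blank1")
pattern <- paste0(condition_options, collapse = "|")

# stringr: unmatched samples become NA.
from_extract <- str_extract(tolower(samples), pattern)

# base R sub(): unmatched samples are returned as-is (lowercased here).
from_sub <- sub(paste0(".*(", pattern, ").*"), "\\1", tolower(samples))

# base R with an explicit NA for samples where the pattern is not found.
from_sub_na <- ifelse(grepl(pattern, tolower(samples)), from_sub, NA_character_)

print(from_extract)
print(from_sub)
print(from_sub_na)
```

Note that the condition strings are used as regular expressions here, so any regex metacharacters typed by a user would be interpreted as such.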
Source: https://stackoverflow.com/questions/57861055/how-can-i-use-mutate-and-case-when-in-a-for-loop