Question
I am looking for a way to use a list of filter arguments to produce different objects. I have a data set for which I want to make several graphs. However, I want all these graphs based on subsets of the dataset. For illustrative purposes I have made the following data.
df <- data.frame(type = c("b1", "b2", "b1", "b2"),
yield = c("15", "10", "5", "0"),
temperature = c("2", "21", "26", "13"),
Season = c("Winter", "Summer", "Summer", "Autumn"),
profit = c(TRUE, TRUE, FALSE, FALSE))
Also, I have a list of filter arguments.
filters <- c("brand=='b1'",
"profit",
"Season=='Summer'",
"profit==FALSE",
"yield >= 10",
"")
What I would want is that I could use a for loop to have all these filters produce objects with the filtered data, and subsequently plot graphs. I have tried this in the following way.
for(i in 1:length(filters)){
assign(paste0("df", i), filter(df, factor(filters[i])))
assign(paste0("plot", i), ggplot(database, aes(x = temperature, y = yield)) + geom_point())
}
However, this did not work because the filter() function does not accept <fct> as an argument, nor <chr> (e.g., "brand=='b1'"). What I would want is brand=='b1', so that filter() accepts it as an argument. Does anybody have an idea how to do this?
Also, as an additional question, I would like to automate the whole process and end with a combined graph, so grid.arrange() at the end. Of course I could automate the ncol and nrow with some division of length(filters). But how do I get all the produced plots into grid.arrange()? This should probably be outside the for loop, right? Any ideas here?
Answer 1:
You can do it by using eval and parse.
Also, an lapply over a custom function sounds more reasonable than a for loop with assign. The result is a list of ggplot objects.
To arrange all the charts together, grid.arrange from the gridExtra package works fine. You just need to pass the list of your charts to the argument called grobs.
library(dplyr)
library(ggplot2)

df <- data.frame(type = c("b1", "b2", "b1", "b2"),
                 yield = c(15, 10, 5, 0),
                 temperature = c("2", "21", "26", "13"),
                 Season = c("Winter", "Summer", "Summer", "Autumn"),
                 profit = c(TRUE, TRUE, FALSE, FALSE))

# Each filter is a string holding one logical expression over the columns of df
filters <- list("type=='b1'",
                "profit",
                "Season=='Summer'",
                "profit==FALSE",
                "yield >= 10",
                "TRUE")

# Parse the filter string, apply it to df, and plot the filtered rows
myfun <- function(fltr, df){
  df <- filter(df, eval(parse(text = fltr)))
  ggplot(df, aes(x = temperature, y = yield)) + geom_point()
}

# One ggplot object per filter
ggs <- lapply(filters, myfun, df = df)
gridExtra::grid.arrange(grobs = ggs)
I made a couple of changes to your data: yield must be numeric, since you are using a filter that only makes sense for numeric vectors, and the last filter (which was empty) is now "TRUE" (I assumed you wanted to keep every row).
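As a minimal base-R sketch of what eval and parse contribute here: parse turns the filter string into an unevaluated expression, and eval evaluates it with the data frame's columns visible as variables. The names fltr, expr and keep are illustrative only, and the data frame is a cut-down version of the one above.

```r
df <- data.frame(type = c("b1", "b2", "b1", "b2"),
                 yield = c(15, 10, 5, 0),
                 profit = c(TRUE, TRUE, FALSE, FALSE))

fltr <- "yield >= 10"

# parse() returns an expression vector; take its first (only) element
expr <- parse(text = fltr)[[1]]

# Evaluate the expression using the columns of df as variables
keep <- eval(expr, envir = df)
keep          # TRUE TRUE FALSE FALSE

# Base-R equivalent of the filtering step
df[keep, ]
```

Because eval(parse(text = ...)) runs whatever the string contains, this approach is best kept to filter strings you wrote yourself.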
Answer 2:
Rather than storing your filters as character strings, it would be better to store them as quosures. For example:
library(rlang)
filters <- quos(type=='b1',
profit,
Season=='Summer',
profit==FALSE,
yield >= 10,
TRUE)
Then you can fairly easily map over these with purrr::map:
library(dplyr)
library(purrr)
library(ggplot2)
# Each .x is a single quosure, so it is unquoted with !! (not spliced with !!!)
map(filters, ~df %>% filter(!!.x) %>%
  ggplot(aes(x = temperature, y = yield)) + geom_point())
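If the filters already exist as strings, one possible bridge to this approach is rlang::parse_expr, which converts a string into an unevaluated expression that can then be injected into filter() with !!. The sketch below assumes a cleaned-up version of the question's data (numeric yield and temperature); the names filter_strings, filter_exprs and subsets are illustrative only.

```r
library(dplyr)
library(rlang)

df <- data.frame(type = c("b1", "b2", "b1", "b2"),
                 yield = c(15, 10, 5, 0),
                 temperature = c(2, 21, 26, 13),
                 Season = c("Winter", "Summer", "Summer", "Autumn"),
                 profit = c(TRUE, TRUE, FALSE, FALSE))

filter_strings <- c("type == 'b1'", "profit", "yield >= 10")

# parse_expr() turns one string into one unevaluated expression
filter_exprs <- lapply(filter_strings, parse_expr)

# !! injects each expression into filter(), where it is evaluated against df
subsets <- lapply(filter_exprs, function(e) dplyr::filter(df, !!e))
names(subsets) <- filter_strings

# Number of rows kept by each filter
sapply(subsets, nrow)
```

Each element of subsets is an ordinary data frame, so it can be piped into ggplot() in the same way as in the map() call above.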
Answer 3:
Assume the input data in the Note at the end, which fixes some inconsistencies in the data shown in the question, makes temperature and yield numeric, and simplifies profit == FALSE to just !profit. Define a function Plot which takes a filter, subsets df and plots the result. Then apply it to each filter and use grid.arrange. This uses ggplot2 and gridExtra but no additional packages, and does not use eval explicitly.
(An alternative to the grid.arrange line would be cowplot::plot_grid(plotlist = plots), which gives a slightly different layout.)
library(ggplot2)
library(gridExtra)
Plot <- function(x) {
  data <- do.call("subset", list(df, parse(text = x)))
  ggplot(data, aes(temperature, yield)) + geom_line() + geom_point() + ggtitle(x)
}
plots <- Map(Plot, filters)
do.call("grid.arrange", plots)
Note
df <- data.frame(brand = c("b1", "b2", "b1", "b2"),
yield = c(15, 10, 5, 0),
temperature = c(2, 21, 26, 13),
Season = c("Winter", "Summer", "Summer", "Autumn"),
profit = c(TRUE, TRUE, FALSE, FALSE))
filters <- c("brand=='b1'",
"profit",
"Season=='Summer'",
"!profit",
"yield >= 10",
TRUE)
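On the question's side point about deriving ncol and nrow from length(filters): one simple option is a near-square grid. The sketch below is only an illustration, and grid_dims is a hypothetical helper, not a function from gridExtra.

```r
# Hypothetical helper (not part of gridExtra): near-square layout for n plots
grid_dims <- function(n) {
  n_col <- ceiling(sqrt(n))     # columns: round the square root up
  n_row <- ceiling(n / n_col)   # rows: enough to hold all n plots
  list(nrow = n_row, ncol = n_col)
}

dims <- grid_dims(6)  # six filters, as in the question
dims$nrow  # 2
dims$ncol  # 3
```

These values can be passed along with the plot list, for example grid.arrange(grobs = plots, nrow = dims$nrow, ncol = dims$ncol). When neither is supplied, grid.arrange chooses a layout on its own.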
Source: https://stackoverflow.com/questions/59273929/r-how-to-filter-data-with-a-list-of-arguments-to-produce-multiple-data-frames