Question
The forcats vignette states that
"The goal of the forcats package is to provide a suite of useful tools that solve common problems with factors."
And indeed one of the tools reorders a factor by another variable, which is a very common use case when plotting data. I was trying to use forcats to accomplish this, but in the case of a faceted plot. That is, I want to reorder a factor by another variable, but using only a subset of the data. Here's a reprex:
library(tidyverse)
ggplot2::diamonds %>%
  group_by(cut, clarity) %>%
  summarise(value = mean(table, na.rm = TRUE)) %>%
  ggplot(aes(x = clarity, y = value, color = clarity)) +
  geom_segment(aes(xend = clarity, y = min(value), yend = value),
               size = 1.5, alpha = 0.5) +
  geom_point(size = 3) +
  facet_grid(rows = "cut", scales = "free") +
  coord_flip() +
  theme(legend.position = "none")
This code produces a plot close to what I want:
But I want the clarity axis to be sorted by value, so I can quickly spot which clarity has the highest value. But then each facet would imply a different order. So I'd like to choose to order the plot by the values within a specific facet.
The straightforward use of forcats, of course, does not work in this case, because it reorders the factor based on all the values, and not only the values of a specific facet. Let's do it:
# Inserting this line right before the ggplot call
mutate(clarity = forcats::fct_reorder(clarity, value)) %>%
It then produces this plot.
Of course, it reordered the factor based on the whole data, but what if I want the plot ordered by the values of the "Ideal" cut? How can I do this with forcats?
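To make the difference between pooled and per-facet ordering concrete, here is a minimal sketch on made-up data (the data frame `toy` and its columns are hypothetical, not part of the diamonds example). By default `fct_reorder()` summarises each level with the median of all its values, so the resulting order need not match the order inside any single facet:

```r
library(forcats)

# Hypothetical toy data: two facets whose items rank in opposite directions.
toy <- data.frame(
  facet = c("A", "A", "A", "B", "B", "B"),
  item  = factor(c("x", "y", "z", "x", "y", "z")),
  value = c(1, 2, 3, 30, 20, 10)
)

# Pooled reordering: one median per item, computed across both facets.
# Medians are x = 15.5, y = 11, z = 6.5.
pooled <- fct_reorder(toy$item, toy$value)
levels(pooled)
# [1] "z" "y" "x"

# Ordering inside facet "A" alone is the reverse.
in_a <- toy$facet == "A"
only_a <- fct_reorder(droplevels(toy$item[in_a]), toy$value[in_a])
levels(only_a)
# [1] "x" "y" "z"
```

Here the pooled order is dominated by the large values in facet "B", which is the same effect seen in the plot above.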
My current solution would be as follows:
ggdf <- ggplot2::diamonds %>%
  group_by(cut, clarity) %>%
  summarise(value = mean(table, na.rm = TRUE))
# The trick would be to create an auxiliary factor using only
# the subset of the data I want, and then use the levels
# to reorder the factor in the entire dataset.
#
# Note that I use good-old reorder, and not the forcats version
# which I could have, but better this way to emphasize that
# so far I haven't found the advantage of using forcats
reordered_factor <- reorder(ggdf$clarity[ggdf$cut == "Ideal"],
                            ggdf$value[ggdf$cut == "Ideal"])
ggdf$clarity <- factor(ggdf$clarity, levels = levels(reordered_factor))
ggdf %>%
  ggplot(aes(x = clarity, y = value, color = clarity)) +
  geom_segment(aes(xend = clarity, y = min(value), yend = value),
               size = 1.5, alpha = 0.5) +
  geom_point(size = 3) +
  facet_grid(rows = "cut", scales = "free") +
  coord_flip() +
  theme(legend.position = "none")
Which produces what I want.
But I wonder if there is a more elegant/clever way to do it using forcats.
Answer 1:
If you want to reorder clarity by the values of a particular facet, you have to tell forcats::fct_reorder() to do so, e.g.,
mutate(clarity = forcats::fct_reorder(
  clarity, filter(., cut == "Ideal") %>% pull(value)))
which uses only the values for the "Ideal" facet for reordering.
Thus,
ggplot2::diamonds %>%
  group_by(cut, clarity) %>%
  summarise(value = mean(table, na.rm = TRUE)) %>%
  mutate(clarity = forcats::fct_reorder(
    clarity, filter(., cut == "Ideal") %>% pull(value))) %>%
  ggplot(aes(x = clarity, y = value, color = clarity)) +
  geom_segment(aes(xend = clarity, y = min(value), yend = value),
               size = 1.5, alpha = 0.5) +
  geom_point(size = 3) +
  facet_grid(rows = "cut", scales = "free") +
  coord_flip() +
  theme(legend.position = "none")
creates the plot as requested.
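One caveat, as far as I can tell: the snippet above works because the data is still grouped by cut after summarise(), so each group holds 8 clarity rows that line up by position with the 8 "Ideal" values. On ungrouped data the lengths would no longer match. A sketch of an alternative that does not depend on grouping is to compute the level order from the "Ideal" subset explicitly and apply it with forcats::fct_relevel(). The names `ideal_levels` and `ggdf_sorted` are mine, and the `.groups` argument assumes dplyr 1.0.0 or later:

```r
library(dplyr)
library(forcats)

# Summarise as in the question, but drop the grouping explicitly
# (assumes dplyr >= 1.0.0 for the .groups argument).
ggdf <- ggplot2::diamonds %>%
  group_by(cut, clarity) %>%
  summarise(value = mean(table, na.rm = TRUE), .groups = "drop")

# Clarity levels sorted by their value within the "Ideal" cut only.
ideal_levels <- ggdf %>%
  filter(cut == "Ideal") %>%
  arrange(value) %>%
  pull(clarity) %>%
  as.character()

# Apply that level order to the factor in the whole data set.
ggdf_sorted <- ggdf %>%
  mutate(clarity = fct_relevel(clarity, ideal_levels))

levels(ggdf_sorted$clarity)
```

`ggdf_sorted` can then be piped into the same ggplot() call as before. This is essentially the asker's own approach written with forcats, so the gain is readability rather than new functionality.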
Source: https://stackoverflow.com/questions/54430898/how-to-reorder-a-factor-based-on-a-subset-facets-of-another-variable-using-fo