How to reorder a factor based on a subset (facets) of another variable, using forcats?

≯℡__Kan透↙ submitted on 2020-06-27 13:40:18

Question


The forcats vignette states that

The goal of the forcats package is to provide a suite of useful tools that solve common problems with factors

And indeed one of the tools reorders a factor by another variable, which is a very common use case when plotting data. I was trying to use forcats to accomplish this, but in the case of a faceted plot. That is, I want to reorder a factor by another variable, but using only a subset of the data. Here's a reprex:

library(tidyverse)

ggplot2::diamonds %>% 
    group_by(cut, clarity) %>% 
    summarise(value = mean(table, na.rm = TRUE)) %>%
    ggplot(aes(x = clarity, y = value, color = clarity)) + 
    geom_segment(aes(xend = clarity, y = min(value), yend = value), 
                 size = 1.5, alpha = 0.5) + 
    geom_point(size = 3) + 
    facet_grid(rows = "cut", scales = "free") +
    coord_flip() +
    theme(legend.position = "none")

This code produces a plot close to what I want:

But I want the clarity axis to be sorted by value, so I can quickly spot which clarity has the highest value. But then each facet would imply a different order. So I'd like to choose to order the plot by the values within a specific facet.

The straightforward use of forcats, of course, does not work in this case, because it reorders the factor based on all the values, not only the values of a specific facet. Let's do it:

# Inserting this line right before the ggplot call
mutate(clarity = forcats::fct_reorder(clarity, value)) %>%

It then produces this plot.
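
To make the difference concrete, here is a minimal sketch on made-up toy data. The data frame `df` and its columns `facet`, `lvl` and `value` are hypothetical and only serve as illustration. By default `fct_reorder()` summarises the second argument with the median per level, so passing the full columns mixes both facets, while passing only the rows of one facet gives that facet's order.

```r
library(forcats)

# Hypothetical toy data: two facets ("A", "B"), three levels in each
df <- data.frame(
  facet = rep(c("A", "B"), each = 3),
  lvl   = factor(rep(c("x", "y", "z"), times = 2)),
  value = c(2, 1, 3,   # facet A: y < x < z
            9, 5, 0)   # facet B: z < y < x
)

# Reordering on all rows: the median of value per level spans both facets
by_all <- fct_reorder(df$lvl, df$value)
levels(by_all)
# "z" "y" "x"

# Reordering on the rows of facet "A" only
in_a <- df$facet == "A"
by_a <- fct_reorder(df$lvl[in_a], df$value[in_a])
levels(by_a)
# "y" "x" "z"
```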

Of course, it reordered the factor based on the whole data. But what if I want the plot ordered by the values of the "Ideal" cut? How can I do this with forcats?

My current solution would be as follows:

ggdf <- ggplot2::diamonds %>% 
    group_by(cut, clarity) %>% 
    summarise(value = mean(table, na.rm = TRUE))

# The trick would be to create an auxiliary factor using only
# the subset of the data I want, and then use the levels
# to reorder the factor in the entire dataset.
#
# Note that I use good-old reorder, and not the forcats version
# which I could have, but better this way to emphasize that
# so far I haven't found the advantage of using forcats 
reordered_factor <- reorder(ggdf$clarity[ggdf$cut == "Ideal"], 
                            ggdf$value[ggdf$cut == "Ideal"])

ggdf$clarity <- factor(ggdf$clarity, levels = levels(reordered_factor))

ggdf %>%
    ggplot(aes(x = clarity, y = value, color = clarity)) + 
    geom_segment(aes(xend = clarity, y = min(value), yend = value), 
                 size = 1.5, alpha = 0.5) + 
    geom_point(size = 3) + 
    facet_grid(rows = "cut", scales = "free") +
    coord_flip() +
    theme(legend.position = "none")

Which produces what I want.

But I wonder if there is a more elegant/clever way to do it using forcats.


Answer 1:


If you want to reorder clarity by the values of a particular facet you have to tell forcats::fct_reorder() to do so, e.g.,

mutate(clarity = forcats::fct_reorder(
    clarity, filter(., cut == "Ideal") %>% pull(value)))

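which uses only the values for the "Ideal" facet for reordering. Note that this relies on summarise() leaving the data grouped by cut: mutate() then works on one cut at a time, so clarity and the "Ideal" values have the same length and line up row by row. On ungrouped data the lengths would differ and fct_reorder() would fail.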

Thus,

ggplot2::diamonds %>% 
  group_by(cut, clarity) %>% 
  summarise(value = mean(table, na.rm = TRUE)) %>%
  mutate(clarity = forcats::fct_reorder(
    clarity, filter(., cut == "Ideal") %>% pull(value))) %>%
  ggplot(aes(x = clarity, y = value, color = clarity)) + 
  geom_segment(aes(xend = clarity, y = min(value), yend = value), 
               size = 1.5, alpha = 0.5) + 
  geom_point(size = 3) + 
  facet_grid(rows = "cut", scales = "free") +
  coord_flip() +
  theme(legend.position = "none")

creates

as requested.
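
If depending on the leftover grouping feels fragile, a variant that computes the level order explicitly may be easier to reason about. This is a sketch under a few assumptions rather than part of the original answer: `ideal_order` is a helper name introduced here, `.groups = "drop"` assumes dplyr 1.0.0 or later, and `fct_relevel()` is swapped in for `fct_reorder()` because the desired order is already known as a character vector.

```r
library(dplyr)
library(forcats)

# Summarise as before, but drop the grouping so later steps see all rows
ggdf <- ggplot2::diamonds %>%
  group_by(cut, clarity) %>%
  summarise(value = mean(table, na.rm = TRUE), .groups = "drop")

# Hypothetical helper: clarity levels sorted by value within the "Ideal" cut
ideal_order <- ggdf %>%
  filter(cut == "Ideal") %>%
  arrange(value) %>%
  pull(clarity) %>%
  as.character()

# Apply that order to the factor in the whole data set
ggdf <- ggdf %>%
  mutate(clarity = fct_relevel(clarity, ideal_order))

levels(ggdf$clarity)
```

The resulting `ggdf` can be passed to the same ggplot() call as in the question. Changing `"Ideal"` in the filter() step picks a different facet as the reference.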



Source: https://stackoverflow.com/questions/54430898/how-to-reorder-a-factor-based-on-a-subset-facets-of-another-variable-using-fo
