ggplot2::geom_boxplot(): How to change the width of one box group in R

野趣味 2021-01-13 01:06

I want to adapt the width of the box in the category "random" to match the width of the other boxes in the plot. It is now a single group, whereas the other groups contain

2 answers

广开言路 (original poster)

2021-01-13 01:27

The second solution here can be modified to suit your case:

Step 1. Add placeholder rows to the dataset using complete() from the tidyr package:

library(dplyr)  # provides %>% and select()

TablePerCatchmentAndYear2 <- TablePerCatchmentAndYear %>% 
  dplyr::select(NoiseType, TempRes, POA) %>%
  tidyr::complete(NoiseType, TempRes, fill = list(POA = 100))
# 100 is an arbitrary placeholder, chosen to lie far outside the range of
# POA values shown in the boxplot
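
To illustrate what complete() does in this step, here is a minimal sketch on a small made-up data frame. The name toy and all of its values are hypothetical and are not the asker's data. The one missing NoiseType/TempRes combination is added as a new row that carries the filler value.

```r
library(tidyr)

# Hypothetical toy data: "random" has no "weekly" rows
toy <- data.frame(
  NoiseType = c("bench", "bench", "random"),
  TempRes   = c("hourly", "weekly", "hourly"),
  POA       = c(0.2, -0.3, 0.5)
)

# Add every missing NoiseType x TempRes combination, with POA set to 100
toy_filled <- complete(toy, NoiseType, TempRes, fill = list(POA = 100))
print(toy_filled)
```

Existing rows are left untouched; only the combinations that were absent receive the fill value.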

Step 2. Plot, setting the y-axis limits within coord_cartesian():

ggplot(TablePerCatchmentAndYear2, aes(x = NoiseType, y = POA, fill = TempRes)) +
  geom_boxplot(lwd = 0.05) +
  coord_cartesian(ylim = c(-1.25, 1)) +  # zoom only; placeholder rows stay in the data
  theme(legend.position = 'bottom') +
  ggtitle('title') +
  scale_fill_discrete(name = '')

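The limits are set in coord_cartesian() rather than with ylim() because ylim() would make the empty space reserved for the weekly "random" box disappear: the placeholder value of 100 lies outside the limits, so it would be discarded before the boxes are drawn. The help file for ylim states: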

Note that, by default, any values outside the limits will be replaced with NA.

While the help file for coord_cartesian states:

Setting limits on the coordinate system will zoom the plot (like you're looking at it with a magnifying glass), and will not change the underlying data like setting limits on a scale will.
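
The difference between the two kinds of limits can be checked by counting the boxes ggplot2 actually computes. The sketch below uses a hypothetical toy data frame (not the asker's data) in which the single "random"/"weekly" row is the filler value of 100, and it relies on layer_data() returning one row per computed box.

```r
library(ggplot2)

# Hypothetical toy data; the ("random", "weekly") row with POA = 100 is the filler
toy <- data.frame(
  NoiseType = c(rep("bench", 6), rep("random", 3), "random"),
  TempRes   = c(rep("hourly", 3), rep("weekly", 3), rep("hourly", 3), "weekly"),
  POA       = c(0.1, 0.2, 0.3, -0.2, 0.0, 0.4, 0.5, 0.6, 0.7, 100)
)

p <- ggplot(toy, aes(x = NoiseType, y = POA, fill = TempRes)) + geom_boxplot()

# Scale limits: the filler becomes NA and is removed (ggplot2 warns about this)
boxes_ylim <- nrow(suppressWarnings(layer_data(p + ylim(-1.25, 1))))

# Coordinate limits: the filler stays in the data, so its box still takes up space
boxes_zoom <- nrow(layer_data(p + coord_cartesian(ylim = c(-1.25, 1))))

print(c(ylim = boxes_ylim, coord_cartesian = boxes_zoom))
```

With ylim() the filler group has no data left, so the remaining "random" box is free to expand to the full width again.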

Alternative solution

This keeps all boxes at the same width, regardless of how many fill-factor levels are associated with each category along the x-axis. It does so by flattening the hierarchical "x variable"~"fill factor variable" relationship, so that each combination of the two is given equal weight (and hence equal width) in the boxplot.

Step 1. Define the position of each boxplot along the x-axis, treating the x-axis as numeric rather than categorical:

TablePerCatchmentAndYear3 <- TablePerCatchmentAndYear %>%
  mutate(NoiseType.Numeric = as.numeric(factor(NoiseType))) %>%
  mutate(NoiseType.Numeric = NoiseType.Numeric + case_when(NoiseType != "random" & TempRes == "hourly" ~ -0.2,
                                                           NoiseType != "random" & TempRes == "weekly" ~ +0.2,
                                                           TRUE ~ 0))

# check the result
TablePerCatchmentAndYear3 %>% 
  select(NoiseType, TempRes, NoiseType.Numeric) %>% 
  unique() %>% arrange(NoiseType.Numeric)

        NoiseType TempRes NoiseType.Numeric
1           bench  hourly               0.8
2           bench  weekly               1.2
3 LogNormSDdivBy1  hourly               1.8
4 LogNormSDdivBy1  weekly               2.2
5 LogNormSDdivBy2  hourly               2.8
6 LogNormSDdivBy2  weekly               3.2
7 LogNormSDdivBy4  hourly               3.8
8 LogNormSDdivBy4  weekly               4.2
9          random  hourly               5.0

Step 2. Plot, labeling the numeric x-axis with categorical labels:

ggplot(TablePerCatchmentAndYear3,
       aes(x = NoiseType.Numeric, y = POA, fill = TempRes, group = NoiseType.Numeric)) +
  geom_boxplot() +
  # label the numeric positions 1..5 with the category names, in factor order
  scale_x_continuous(name = "NoiseType", breaks = c(1, 2, 3, 4, 5), minor_breaks = NULL,
                     labels = sort(unique(TablePerCatchmentAndYear3$NoiseType)),
                     expand = c(0, 0)) +
  coord_cartesian(ylim = c(-1.25, 1), xlim = c(0.5, 5.5)) +
  theme(legend.position = 'bottom') +
  ggtitle('title') +
  scale_fill_discrete(name = '')

Note: Personally, I wouldn't recommend this solution. It's difficult to automate / generalize as it requires different manual adjustments depending on the number of fill variable levels present. But if you really need this for a one-off use case, it's here.
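
If the manual offsets are the main obstacle, one possible way to derive them from the data is sketched below. This is an assumption-laden illustration rather than part of the original answer: the toy data frame, the slot_width value of 0.4, and the helper columns x_base, n_levels and level_rank are all hypothetical. Each box is shifted by its rank among the TempRes levels present in its category, centred on the category's integer position.

```r
library(dplyr)

# Hypothetical toy data: "random" only has the "hourly" level
toy <- data.frame(
  NoiseType = c("bench", "bench", "random"),
  TempRes   = c("hourly", "weekly", "hourly")
)

slot_width <- 0.4  # assumed spacing between neighbouring boxes in one category

toy_pos <- toy %>%
  mutate(x_base = as.numeric(factor(NoiseType))) %>%  # integer position per category
  group_by(NoiseType) %>%
  mutate(
    n_levels   = n_distinct(TempRes),   # boxes in this category
    level_rank = dense_rank(TempRes),   # 1, 2, ... within the category
    NoiseType.Numeric = x_base + (level_rank - (n_levels + 1) / 2) * slot_width
  ) %>%
  ungroup()

print(toy_pos)
```

With two levels this reproduces the -0.2 / +0.2 offsets used above, and a category with a single level stays centred. Because the rank is taken among the levels present in each category, a category that is missing a middle level would be spaced differently from a fixed per-level offset.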
