Question
I would like to draw a bar plot in ggplot2 of one categorical variable grouped by a second categorical variable, and use facet_wrap to split it into separate panels by a third. Then I would like to show the percentage for each bar. Here is a reproducible example:
library(ggplot2)
library(scales)  # provides percent(), used for the y-axis labels
test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:5], 100, replace = TRUE),
  test3 = sample(letters[9:11], 100, replace = TRUE)
)
ggplot(test, aes(x = factor(test1))) +
  geom_bar(aes(fill = factor(test2), y = ..prop.., group = factor(test2)), position = "dodge") +
  facet_wrap(~factor(test3)) +
  scale_y_continuous("Percentage (%)", limits = c(0, 1), breaks = seq(0, 1, by = 0.1), labels = percent) +
  scale_x_discrete("") +
  theme(plot.title = element_text(hjust = 0.5), panel.grid.major.x = element_blank())
This gives me a bar plot with the percentage of test2 by test1 within each level of test3. I would like to show the percentage on top of each bar. I would also like to change the legend title on the right from factor(test2) to Test2.
Answer 1:
It may be easiest to do the data summary yourself so that you can create a column with the percentage labels you want. (Note that, as is, I'm not sure what you want your percentages to show: in facet i, group b, there is a column that is nearly 90% and two columns that are greater than or equal to 50%. Is that intended?)
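As far as I can tell, those odd-looking heights come from group = factor(test2): the proportion is then computed within each test2 group inside each facet, so the two bars of one colour (a and b) add up to 100%, rather than the three bars sitting over one x value. A rough dplyr equivalent of that calculation is sketched below. The seed and the names orig.prop, n and prop are only there for illustration and are not part of your code.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only so this sketch is reproducible
test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:5], 100, replace = TRUE),
  test3 = sample(letters[9:11], 100, replace = TRUE)
)

# Share of each test1 level within every (test3, test2) combination,
# which is what the bar heights in the original plot appear to show
orig.prop <- test %>%
  count(test3, test2, test1) %>%
  group_by(test3, test2) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

orig.prop
```

If that is not the percentage you are after, summarizing the data yourself makes the denominator explicit.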
Libraries and your example data frame:
library(ggplot2)
library(dplyr)
test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:5], 100, replace = TRUE),
  test3 = sample(letters[9:11], 100, replace = TRUE)
)
First, group by all columns (note the order), then summarize to get the length of test2. Mutate to get a value for the column height and label; here I've multiplied by 100 and rounded.
test.grouped <- test %>%
  group_by(test1, test3, test2) %>%
  summarize(t2.len = length(test2)) %>%
  mutate(t2.prop = round(t2.len / sum(t2.len) * 100, 1))
> test.grouped
# A tibble: 18 x 5
# Groups: test1, test3 [6]
  test1  test3  test2  t2.len t2.prop
  <fctr> <fctr> <fctr>  <int>   <dbl>
1 a      i      c           4    30.8
2 a      i      d           5    38.5
3 a      i      e           4    30.8
4 a      j      c           3    20.0
5 a      j      d           8    53.3
...
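The order in group_by matters because summarize drops the last grouping level (test2), which is why the result is still grouped by test1 and test3 and sum(t2.len) is taken within each of those combinations. If you would rather not rely on that, here is a sketch that spells the denominator out and checks that each group totals roughly 100. It assumes a reasonably recent dplyr (for the name argument of count and the .groups argument of summarize); the seed and the names test.grouped2 and totals are just for illustration.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only so this sketch is reproducible
test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:5], 100, replace = TRUE),
  test3 = sample(letters[9:11], 100, replace = TRUE)
)

# Count rows per combination, then take percentages within test1 x test3
test.grouped2 <- test %>%
  count(test1, test3, test2, name = "t2.len") %>%
  group_by(test1, test3) %>%
  mutate(t2.prop = round(t2.len / sum(t2.len) * 100, 1)) %>%
  ungroup()

# Sanity check: each test1/test3 combination should total about 100
# (small deviations come from rounding to one decimal place)
totals <- test.grouped2 %>%
  group_by(test1, test3) %>%
  summarize(total = sum(t2.prop), .groups = "drop")

totals
```

To use a different denominator, for example percentages within each facet only, change the group_by(test1, test3) line accordingly.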
Use the summarized data to build your plot, using geom_text to use the proportion column as the label:
ggplot(test.grouped, aes(x = test1,
                         y = t2.prop,
                         fill = test2,
                         group = test2)) +
  geom_bar(stat = "identity", position = position_dodge(width = 0.9)) +
  geom_text(aes(label = paste(t2.prop, "%", sep = ""),
                group = test2),
            position = position_dodge(width = 0.9),
            vjust = -0.8) +
  facet_wrap(~ test3) +
  scale_y_continuous("Percentage (%)") +
  scale_x_discrete("") +
  theme(plot.title = element_text(hjust = 0.5), panel.grid.major.x = element_blank())
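For the legend title, one option is labs(fill = "Test2"). The sketch below also adds some headroom above the bars so the labels on the tallest ones are less likely to be clipped; that part assumes a ggplot2 version that has expansion() (3.3.0 or later, as far as I know). The seed and the object name p are only for illustration.

```r
library(ggplot2)
library(dplyr)

set.seed(1)  # assumption: seed added only so this sketch is reproducible
test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:5], 100, replace = TRUE),
  test3 = sample(letters[9:11], 100, replace = TRUE)
)

# Same summary as above: percentage of test2 within each test1 x test3 group
test.grouped <- test %>%
  group_by(test1, test3, test2) %>%
  summarize(t2.len = length(test2)) %>%
  mutate(t2.prop = round(t2.len / sum(t2.len) * 100, 1))

p <- ggplot(test.grouped, aes(x = test1, y = t2.prop,
                              fill = test2, group = test2)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_text(aes(label = paste0(t2.prop, "%")),
            position = position_dodge(width = 0.9),
            vjust = -0.8) +
  facet_wrap(~ test3) +
  # 10% extra space at the top of the y axis for the text labels
  scale_y_continuous("Percentage (%)",
                     expand = expansion(mult = c(0, 0.1))) +
  # fill = "Test2" sets the legend title; x = NULL removes the x-axis title
  labs(x = NULL, fill = "Test2") +
  theme(panel.grid.major.x = element_blank())

p
```

geom_col() is shorthand for geom_bar(stat = "identity"), so the bars are the same as in the plot above.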
Source: https://stackoverflow.com/questions/47478138/barplot-with-ggplot-2-of-two-categorical-variable-facet-wrap-according-a-third-v