Question
I have a few vectors of varying length for which I would like to generate side-by-side boxplots using ggplot2. This is relatively straightforward with the base plotting system. However, ggplot2 only takes a single data frame as input, which is difficult to create from data of varying lengths.
a <- rnorm(10)
b <- rnorm(100)
c <- rnorm(1000)
boxplot(a, b, c)
Q: What is the correct way to draw boxplots with ggplot2 from data of varying lengths?
Answer 1:
ggplot2 works with tidy, long data frames in which the values sit in one column and the group label (a, b, or c) sits in another. In your example, you can make a data frame with 1110 rows (10 + 100 + 1000) and two columns (the group and the value), like so:
# Make individual data frames
a <- data.frame(group = "a", value = rnorm(10))
b <- data.frame(group = "b", value = rnorm(100))
c <- data.frame(group = "c", value = rnorm(1000))
# Combine into one long data frame
plot.data <- rbind(a, b, c)
# group value
# 1 a 0.2322682
# 2 a -0.9681992
# ...
# 101 b 0.3422354
# 102 b 0.3495342
# ...
# 1001 c -0.6839231
# 1002 c -1.4329843
# Plot
library(ggplot2)
ggplot(plot.data, aes(x=group, y=value, fill=group)) + geom_boxplot()
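If the data already exist as plain vectors, as in the question, the same long data frame can probably be built in one step with base R's stack(). The following is a minimal sketch, not the answer's own method: the set.seed() call is an assumption added only for reproducibility, and the column names values and ind are the ones stack() assigns by default.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# Plain vectors of different lengths, as in the question
a <- rnorm(10)
b <- rnorm(100)
c <- rnorm(1000)

# stack() turns a named list of vectors into a long data frame with
# two columns: `values` (the numbers) and `ind` (the list name, as a factor)
plot.data <- stack(list(a = a, b = b, c = c))

# Same plot as above, using the column names produced by stack()
p <- ggplot(plot.data, aes(x = ind, y = values, fill = ind)) +
  geom_boxplot()
print(p)
```

This avoids creating one data frame per group by hand, at the cost of less descriptive column names, which can be changed afterwards with names(plot.data) <- c("value", "group").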
Source: https://stackoverflow.com/questions/26918358/ggplot2-multiple-boxplots-from-sources-of-different-lengths