How can I use stat_summary to label a plot with n = x, where x is a variable? Here's an example of the desired output:
You can write your own function to use inside stat_summary(). Here, n_fun computes the y position of the label as the median() of the group and builds the label from "n = " and the number of observations. It is important to return a data.frame() instead of c(): paste0() produces a character string while the y value is numeric, and c() would coerce both to character. Then pass this function to stat_summary() as fun.data, with geom = "text". This ensures that, for each x value, the position and the label are computed only from that level's data.
library(ggplot2)

# Return the label position (group median) and the label text ("n = <count>")
n_fun <- function(x) {
  return(data.frame(y = median(x), label = paste0("n = ", length(x))))
}

ggplot(mtcars, aes(factor(cyl), mpg, label = rownames(mtcars))) +
  geom_boxplot(fill = "grey80", colour = "#3366FF") +
  stat_summary(fun.data = n_fun, geom = "text")
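To see why the return type matters, a quick check in base R shows how c() coerces the pair to character while data.frame() keeps y numeric. This is only a sketch: the 4-cylinder subset and the names as_vector and as_df are chosen purely for illustration.

```r
# Illustration only: mpg values for the 4-cylinder cars in mtcars
x <- mtcars$mpg[mtcars$cyl == 4]

# c() returns an atomic vector, so the numeric median is coerced to character
as_vector <- c(y = median(x), label = paste0("n = ", length(x)))
class(as_vector)   # "character"

# data.frame() keeps one type per column, so y stays numeric
as_df <- data.frame(y = median(x), label = paste0("n = ", length(x)))
class(as_df$y)     # "numeric"
```

A text geom needs a numeric y to position the label, which is why the data.frame() form is the one passed to fun.data.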
Most things in R are vectorized, so you can leverage that.
library(ggplot2)

# Count the observations in each cylinder group
nlabels <- table(mtcars$cyl)
# To create the median labels, you can use by()
meds <- c(by(mtcars$mpg, mtcars$cyl, median))

ggplot(mtcars, aes(factor(cyl), mpg, label = rownames(mtcars))) +
  geom_boxplot(fill = "grey80", colour = "#3366FF") +
  geom_text(data = data.frame(),
            aes(x = names(meds), y = meds, label = paste("n =", nlabels)))
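As a small check of the vectorized pieces used here, the following sketch prints the label and position vectors on their own (the name labels is only for illustration). paste() recycles "n =" across all counts and by() returns one median per group, so the two vectors line up by cylinder level:

```r
nlabels <- table(mtcars$cyl)
meds <- c(by(mtcars$mpg, mtcars$cyl, median))

# paste() is vectorized: one label per group, in the same order as meds
labels <- paste("n =", nlabels)
labels        # "n = 11" "n = 7"  "n = 14"
names(meds)   # "4" "6" "8"
```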
Regarding nlabels: instead of your sapply statement, you can simply use:
nlabels <- table(mtcars$cyl)
Notice that your current code is taking the above, converting it, transposing it, then iterating over each row only to grab the values one by one, then put them back together into a single vector.
If you really want them as an un-dimensioned integer vector, use c():
nlabels <- c(table(mtcars$cyl))
but of course, even this is not needed to accomplish the above.
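The difference between the two forms can be inspected directly. This is a minimal sketch, with tab and vec as illustrative names:

```r
tab <- table(mtcars$cyl)
vec <- c(tab)

class(tab)   # "table", a one-dimensional array with dimnames
dim(tab)     # 3
class(vec)   # "integer"
dim(vec)     # NULL; c() dropped the dim attribute but kept the names
names(vec)   # "4" "6" "8"
```

Either form can be passed to paste("n =", ...), which is why the conversion is optional.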