I am plotting histograms using geom_histogram and I would like to label each histogram with the mean value (I am using mean for the sake of this example). The issue is that I am drawing multiple histograms in one facet and I get labels overlapping. This is an example:
df <- data.frame (type=rep(1:2, each=1000), subtype=rep(c("a","b"), each=500), value=rnorm(4000, 0,1))
plt <- ggplot(df, aes(x=value, fill=subtype)) + geom_histogram(position="identity", alpha=0.4)
plt <- plt + facet_grid(. ~ type)
plt + geom_text(aes(label = paste("mean=", mean(value)), colour=subtype, x=-Inf, y=Inf), data = df, size = 4, hjust=-0.1, vjust=2)
Result is:
The problem is that the labels for Subtypes a and b are overlapping. I would like to solve this.
I have tried the position argument with both dodge and stack, for example:
plt + geom_text(aes(label = paste("mean=", mean(value)), colour=subtype, x=-Inf, y=Inf), position="stack", data = df, size = 4, hjust=-0.1, vjust=2)
This did not help. In fact, it issued a warning about the width.
Would you please help? Thanks, Riad.
I think you could precalculate the mean values in a new data frame before plotting.
  type subtype   mean.value
1    1       a -0.003138127
2    1       b  0.023252169
3    2       a  0.030831337
4    2       b -0.059001888
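The code that produced this table is not shown here. Since a later reply mentions plyr, it was presumably something along these lines; treat this as a sketch that assumes plyr is installed and that the summary data frame is called df.text (the name used further down). The exact numbers will differ because the data are random.

```r
library(plyr)  # assumption: plyr is installed

set.seed(42)   # only so the example is reproducible
df <- data.frame(type    = rep(1:2, each = 1000),
                 subtype = rep(c("a", "b"), each = 500),
                 value   = rnorm(4000, 0, 1))

# One row per type/subtype combination, holding the mean of 'value'
df.text <- ddply(df, .(type, subtype), summarise, mean.value = mean(value))
df.text
```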
Then use this new data frame in geom_text(). To ensure that the values do not overlap, you can provide two values in vjust= (as there are two values in each facet).
ggplot(df, aes(x=value, fill=subtype)) +
  geom_histogram(position="identity", alpha=0.4) +
  facet_grid(. ~ type) +
  geom_text(data=df.text, aes(label=paste("mean=", mean.value),
            colour=subtype, x=-Inf, y=Inf), size = 4, hjust=-0.1, vjust=c(2,4))
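Depending on your ggplot2 version, a vjust vector whose length is neither 1 nor the number of rows in the label data may be rejected. One alternative is to store the offset in the summary table and map it inside aes(). The following is only a sketch: the vjust column, the rounding and bins = 30 are my own choices, and the summary is built with base aggregate().

```r
library(ggplot2)

set.seed(42)  # only so the example is reproducible
df <- data.frame(type    = rep(1:2, each = 1000),
                 subtype = rep(c("a", "b"), each = 500),
                 value   = rnorm(4000, 0, 1))

# One row per type/subtype combination, with the mean in 'mean.value'
df.text <- aggregate(value ~ type + subtype, data = df, FUN = mean)
names(df.text)[names(df.text) == "value"] <- "mean.value"

# Hypothetical helper column: one vertical offset per subtype
df.text$vjust <- ifelse(df.text$subtype == "a", 2, 4)

plt <- ggplot(df, aes(x = value, fill = subtype)) +
  geom_histogram(position = "identity", alpha = 0.4, bins = 30) +
  facet_grid(. ~ type) +
  geom_text(data = df.text,
            aes(label = paste("mean=", round(mean.value, 3)),
                colour = subtype, x = -Inf, y = Inf, vjust = vjust),
            size = 4, hjust = -0.1)
plt
```

Because the offset lives in the data, it stays attached to the right subtype even if the row order of df.text changes.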
Just to expand on @Didzis:
You actually have two problems here. First, the text overlaps. More importantly, when you use an aggregating function in aes(...), as in:
geom_text(aes(label = paste("mean=", mean(value)), ...
ggplot does not respect the subsetting implied by the facets (or by the groups, for that matter). So mean(value) is based on the full dataset regardless of faceting or grouping. As a result, you have to use an auxiliary table, as @Didzis shows.
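You can check this yourself by looking at the labels ggplot2 actually computes for the text layer. This is a small sketch with made-up data; layer_data() is ggplot2's helper for inspecting the built data of a layer, and the y = 0 position is just a placeholder.

```r
library(ggplot2)

set.seed(1)  # only so the example is reproducible
df <- data.frame(type    = rep(1:2, each = 200),
                 subtype = rep(rep(c("a", "b"), each = 100), times = 2),
                 value   = rnorm(400))

p <- ggplot(df, aes(x = value)) +
  facet_grid(. ~ type) +
  geom_text(aes(label = mean(value), y = 0, colour = subtype))

# Labels actually used by the text layer, across all facets and groups
used <- unique(layer_data(p, 1)$label)

# Means per facet/group, for comparison
per.group <- tapply(df$value, list(df$type, df$subtype), mean)

length(used)                     # how many distinct labels the plot contains
all.equal(used, mean(df$value))  # compare with the overall mean
per.group                        # the four values you actually wanted
```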
df.text <- aggregate(df$value,by=list(type=df$type,subtype=df$subtype),mean)
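gets you the means and does not require plyr. Note that the resulting column of means is named x rather than mean.value, so rename it or use x in the label.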