When faceting in ggplot, I would often like percentages to be used instead of counts.
e.g.
test1 <- sample(letters[1:2], 100, replace=T)
test2 &
Try this:
# first make a dataframe with frequencies
df <- as.data.frame(with(test, table(test1,test2)))
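# or with count() from the plyr package, as Hadley suggested
# (note: count() names the frequency column 'freq', not 'Freq',
#  so adjust the ddply() call below if you use this variant)
df <- count(test, vars=c('test1', 'test2'))
# next: compute percentages per group
df <- ddply(df, .(test1), transform, p = Freq/sum(Freq))
# and plot (stat="identity" because p is already computed)
ggplot(df, aes(test2, p))+geom_bar(stat="identity")+facet_grid(~test1)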
You could also add + scale_y_continuous(formatter = "percent") to the plot for ggplot2 version 0.8.9, or + scale_y_continuous(labels = percent_format()) for version 0.9.0 (percent_format() comes from the scales package).
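For readers without plyr, the "compute percentages per group" step above can also be sketched in base R. This is only an illustration, not the answer's own method: the seed, the dnn labels, and the column name p are assumptions chosen to mirror the example.

```r
# Assumed test data, mirroring the question (the seed is arbitrary)
set.seed(1)
test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:8], 100, replace = TRUE)
)

# Cross-tabulate, then divide each row (one row per test1 level) by its total
tab <- table(test$test1, test$test2, dnn = c("test1", "test2"))
df  <- as.data.frame(prop.table(tab, margin = 1), responseName = "p")

# df now has columns test1, test2 and p, with p summing to 1 within each test1
```

The resulting df should be usable in the same plotting call as above, with p mapped to y.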
I deal with similar situations quite frequently, but take a very different approach that uses two of Hadley's other packages, namely reshape and plyr, primarily because I prefer looking at things as 100% stacked bars (when they total to 100%).
test <- data.frame(sample(letters[1:2], 100, replace=T), sample(letters[3:8], 100, replace=T))
colnames(test) <- c("variable","value")
test <- cast(test, variable + value ~ .)
colnames(test)[3] <- "frequ"
test <- ddply(test,"variable", function(x) {
x <- x[order(x$value),]
x$cfreq <- cumsum(x$frequ)/sum(x$frequ)
x$pos <- (c(0,x$cfreq[-nrow(x)])+x$cfreq)/2
x$freq <- (x$frequ)/sum(x$frequ)
x
})
plot.tmp <- ggplot(test, aes(variable,frequ, fill=value)) + geom_bar(stat="identity", position="fill") + coord_flip() + scale_y_continuous("", formatter="percent")
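If only the 100% stacked bars are needed (without the pos column computed above for label placement), a shorter route may be to let geom_bar() count the raw rows and normalise them with position = "fill". The sketch below assumes a reasonably recent ggplot2 with the scales package installed; the data frame name demo and the seed are hypothetical.

```r
library(ggplot2)

# Hypothetical raw data in the same shape as the answer's input (seed is arbitrary)
set.seed(1)
demo <- data.frame(
  variable = sample(letters[1:2], 100, replace = TRUE),
  value    = sample(letters[3:8], 100, replace = TRUE)
)

# geom_bar() counts rows per variable/value; position = "fill" rescales
# each bar so that its segments stack to 1
p <- ggplot(demo, aes(variable, fill = value)) +
  geom_bar(position = "fill") +
  coord_flip() +
  scale_y_continuous("", labels = scales::percent)
```

Printing p draws the chart. This does not replace the reshape/plyr version when the cumulative positions are needed for annotations.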
A very simple way:
ggplot(test, aes(test2)) +
geom_bar(aes(y = (..count..)/sum(..count..))) +
facet_grid(~test1)
So I only changed the parameter of geom_bar to aes(y = (..count..)/sum(..count..)).
After setting ylab to NULL and specifying the formatter, you could get:
ggplot(test, aes(test2)) +
geom_bar(aes(y = (..count..)/sum(..count..))) +
facet_grid(~test1) +
scale_y_continuous('', formatter="percent")
Update
Note that while formatter = "percent" works for ggplot2 version 0.8.9, in 0.9.0 you'd want something like scale_y_continuous(labels = percent_format()).
Here is a within-ggplot method, using ..count.. and ..PANEL..:
ggplot(test, aes(test2)) +
geom_bar(aes(y = (..count..)/tapply(..count..,..PANEL..,sum)[..PANEL..])) +
facet_grid(~test1)
As this is computed on the fly, it should be robust to changes to plot parameters.
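In newer ggplot2 releases (3.3.0 and later, as far as I know) the ..count.. notation has been superseded by after_stat(). A rough equivalent of the per-panel normalisation might look like the sketch below. Note that it swaps the tapply() indexing for stats::ave(), which returns the panel totals already aligned to each row; the seed and data are assumptions.

```r
library(ggplot2)

# Assumed test data, mirroring the question (the seed is arbitrary)
set.seed(1)
test <- data.frame(
  test1 = sample(letters[1:2], 100, replace = TRUE),
  test2 = sample(letters[3:8], 100, replace = TRUE)
)

# Divide each bar's count by the total count of the panel (facet) it is in
p <- ggplot(test, aes(test2)) +
  geom_bar(aes(y = after_stat(count / ave(count, PANEL, FUN = sum)))) +
  facet_grid(~test1) +
  scale_y_continuous("", labels = scales::percent)
```

Printing p should give bars that sum to 100% within each facet rather than across the whole plot.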
Here's a solution that should get you moving in the right direction. I'm curious to see if there are more efficient ways to go about doing this, as this seems a bit hacky and convoluted. We can use the built-in ..density.. argument for the y aesthetic, but factors don't work there. So we also need to use scale_x_discrete to appropriately label the axis once we have converted test2 into a numeric object.
ggplot(data = test, aes(x = as.numeric(test2)))+
geom_bar(aes(y = ..density..), binwidth = .5)+
scale_x_discrete(limits = sort(unique(test$test2))) +
facet_grid(~test1) + xlab("Test 2") + ylab("Density")
But give this a whirl and let me know what you think.
Also, you can shorten your test data creation like so, which avoids the extra objects in your environment and having to cbind them together:
test <- data.frame(
test1 = sample(letters[1:2], 100, replace = TRUE),
test2 = sample(letters[3:8], 100, replace = TRUE)
)
Thank you for sharing the PANEL "tip" on the ggplot method.
For information: you can produce percentages on the y axis, on the same bar chart, by using count and group in the ggplot method:
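# percent comes from the scales package: library(scales)
# each '+' must end a line, otherwise R treats the first line as a complete expression
ggplot(test, aes(test2,fill=test1)) +
geom_bar(aes(y = (..count..)/tapply(..count..,..group..,sum)[..group..]), position="dodge") +
scale_y_continuous(labels = percent)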