Percentage on the y-axis label in a faceted ggplot bar chart?

孤独总比滥情好 2020-12-01 00:56

When faceting in ggplot, I would often like the y axis to show percentages instead of counts.

e.g.

test1 <- sample(letters[1:2], 100, replace=T)
test2 &         


        
6 Answers
  • 2020-12-01 01:18

    Try this:

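    library(plyr)     # ddply(), count(), rename()
    library(ggplot2)

    # first make a data frame with frequencies
    df <- as.data.frame(with(test, table(test1, test2)))
    # or with count() from the plyr package, as Hadley suggested
    # (count() names its column `freq`, so rename it to match the next step):
    # df <- rename(count(test, vars = c("test1", "test2")), c(freq = "Freq"))
    # next: compute proportions within each test1 group
    df <- ddply(df, .(test1), transform, p = Freq / sum(Freq))
    # and plot; stat = "identity" because the bar heights are already computed
    ggplot(df, aes(test2, p)) + geom_bar(stat = "identity") + facet_grid(~test1)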
    

    To show the axis as percentages, add + scale_y_continuous(formatter = "percent") to the plot for ggplot2 version 0.8.9, or + scale_y_continuous(labels = percent_format()) for version 0.9.0 (percent_format() comes from the scales package).
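
    For readers without plyr, the same per-facet proportions can be sketched in base R. This is a minimal sketch rather than the answerer's code: it assumes the test data frame from the question (columns test1 and test2), recreated here with an arbitrary seed, and it swaps in prop.table() with margin = 1 so that each test1 group sums to 1. The plotting step is guarded because it assumes ggplot2 and scales are installed.

    ```r
    set.seed(1)  # arbitrary seed, only so the example is reproducible
    test <- data.frame(
      test1 = sample(letters[1:2], 100, replace = TRUE),
      test2 = sample(letters[3:8], 100, replace = TRUE)
    )

    # counts per (test1, test2) cell, then row-wise proportions:
    # margin = 1 divides each row (one test1 level) by its own total
    tab <- with(test, table(test1, test2))
    df <- as.data.frame(prop.table(tab, margin = 1), responseName = "p")

    # the bars of each facet should add up to 1 (that is, 100%)
    facet_totals <- tapply(df$p, df$test1, sum)
    facet_totals

    # plotting is optional and assumes ggplot2 and scales are available
    if (requireNamespace("ggplot2", quietly = TRUE) &&
        requireNamespace("scales", quietly = TRUE)) {
      library(ggplot2)
      p1 <- ggplot(df, aes(test2, p)) +
        geom_bar(stat = "identity") +
        facet_grid(~test1) +
        scale_y_continuous(labels = scales::percent)
      # print(p1) draws the plot
    }
    ```

    Checking facet_totals before plotting is a cheap way to confirm the percentages were computed within each facet and not over the whole data set.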

  • 2020-12-01 01:22

    I deal with similar situations quite frequently, but take a very different approach that uses two of Hadley's other packages, namely reshape and plyr, primarily because I prefer looking at things as 100% stacked bars (when they total 100%).

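    library(reshape)  # cast()
    library(plyr)     # ddply()
    library(ggplot2)

    test <- data.frame(sample(letters[1:2], 100, replace=TRUE), sample(letters[3:8], 100, replace=TRUE))
    colnames(test) <- c("variable","value")
    # count the rows for each variable/value combination
    test <- cast(test, variable + value ~ .)
    colnames(test)[3] <- "frequ"

    # per variable: cumulative share (cfreq), midpoint of each segment (pos), share (freq)
    test <- ddply(test,"variable", function(x) {
        x <- x[order(x$value),]
        x$cfreq <- cumsum(x$frequ)/sum(x$frequ)
        x$pos <- (c(0,x$cfreq[-nrow(x)])+x$cfreq)/2
        x$freq <- (x$frequ)/sum(x$frequ)
        x
    })

    # formatter = "percent" is the ggplot2 0.8.9 spelling;
    # from 0.9.0 on use labels = percent_format() instead
    plot.tmp <- ggplot(test, aes(variable,frequ, fill=value)) +
        geom_bar(stat="identity", position="fill") +
        coord_flip() +
        scale_y_continuous("", formatter="percent")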
    
  • 2020-12-01 01:24

    A very simple way:

    ggplot(test, aes(test2)) + 
        geom_bar(aes(y = (..count..)/sum(..count..))) + 
        facet_grid(~test1)
    

    The only change is the y aesthetic of geom_bar, which is now aes(y = (..count..)/sum(..count..)). After setting the y-axis title to an empty string and specifying the formatter, you get:

    ggplot(test, aes(test2)) +
        geom_bar(aes(y = (..count..)/sum(..count..))) + 
        facet_grid(~test1) +
        scale_y_continuous('', formatter="percent")
    

    Update: note that while formatter = "percent" works for ggplot2 version 0.8.9, in 0.9.0 you'd want something like scale_y_continuous(labels = percent_format()).
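
    One thing worth checking with this approach: as far as I understand, sum(..count..) is taken over every bar in the layer, so the heights are shares of the whole data set and not of each facet. The arithmetic difference can be sketched in base R; the counts below are made up for illustration and the column names are hypothetical.

    ```r
    # hypothetical counts for two facets (a, b) with three bars each
    counts <- data.frame(
      panel = c("a", "a", "a", "b", "b", "b"),
      bar   = c("c", "d", "e", "c", "d", "e"),
      count = c(10, 20, 10, 30, 15, 15)
    )

    # count / sum(count): share of the grand total (100 here)
    counts$share_overall <- counts$count / sum(counts$count)

    # share within each facet: divide by that facet's own total (40 and 60)
    counts$share_in_panel <- counts$count / ave(counts$count, counts$panel, FUN = sum)

    counts
    ```

    With these numbers the first bar is 10% of all observations but 25% of facet a, which is the distinction to keep in mind when reading the y axis.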

  • 2020-12-01 01:27

    Here is a method that stays entirely within ggplot, using ..count.. and ..PANEL..:

    ggplot(test, aes(test2)) + 
        geom_bar(aes(y = (..count..)/tapply(..count..,..PANEL..,sum)[..PANEL..])) + 
        facet_grid(~test1)
    

    As this is computed on the fly, it should be robust to changes to plot parameters.
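
    The tapply(x, g, sum)[g] idiom can look cryptic, so here is a small base R sketch of what it does, with made-up numbers. tapply() returns one total per panel, and indexing that result by the panel factor repeats each total once per bar. The second half is an assumption-laden sketch of the same normalisation in the newer after_stat() syntax (ggplot2 3.3.0 or later, to my knowledge), with ave() swapped in as a shorter equivalent; it recreates the test data with an arbitrary seed and only runs if ggplot2 is installed.

    ```r
    count <- c(10, 20, 10, 30, 15, 15)    # made-up bar heights
    panel <- factor(c(1, 1, 1, 2, 2, 2))  # made-up panel ids, standing in for PANEL

    totals <- tapply(count, panel, sum)        # one total per panel: 40 and 60
    prop <- as.numeric(count / totals[panel])  # totals[panel] repeats each total per bar
    prop_ave <- count / ave(count, panel, FUN = sum)  # ave() gives the same result
    prop

    # Same idea inside ggplot2 with after_stat() (assumes ggplot2 >= 3.3.0)
    if (requireNamespace("ggplot2", quietly = TRUE) &&
        utils::packageVersion("ggplot2") >= "3.3.0") {
      library(ggplot2)
      set.seed(1)  # arbitrary seed for a reproducible example
      test <- data.frame(
        test1 = sample(letters[1:2], 100, replace = TRUE),
        test2 = sample(letters[3:8], 100, replace = TRUE)
      )
      p_facet <- ggplot(test, aes(test2)) +
        geom_bar(aes(y = after_stat(count / ave(count, PANEL, FUN = sum)))) +
        facet_grid(~test1)
      # layer_data() returns the computed bars, handy for checking the heights
      bars <- layer_data(p_facet)
      panel_sums <- tapply(bars$y, bars$PANEL, sum)
    }
    ```

    If the normalisation is per facet, panel_sums should be 1 for every panel.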

  • 2020-12-01 01:42

    Here's a solution that should get you moving in the right direction. I'm curious to see whether there are more efficient ways to do this, as it seems a bit hacky and convoluted. We can use the built-in ..density.. variable for the y aesthetic, but it does not work with factors. So test2 is converted to a numeric object, and scale_x_discrete is then needed to label the axis appropriately.

    ggplot(data = test, aes(x = as.numeric(test2)))+ 
    geom_bar(aes(y = ..density..), binwidth = .5)+ 
    scale_x_discrete(limits = sort(unique(test$test2))) + 
    facet_grid(~test1) + xlab("Test 2") + ylab("Density") 
    

    But give this a whirl and let me know what you think.

    Also, you can shorten your test data creation like so, which avoids the extra objects in your environment and having to cbind them together:

    test <- data.frame(
        test1 = sample(letters[1:2], 100, replace = TRUE), 
        test2 = sample(letters[3:8], 100, replace = TRUE)
    )
    
  • 2020-12-01 01:42

    Thank you for sharing the PANEL "tip" on the ggplot method.

    For information: you can also get percentages on the y axis within a single bar chart (fill instead of facets) by using ..count.. and ..group.. in the same way:

    library(scales)  # provides percent

    ggplot(test, aes(test2, fill = test1)) +
        geom_bar(aes(y = (..count..)/tapply(..count..,..group..,sum)[..group..]), position = "dodge") +
        scale_y_continuous(labels = percent)
    