Display percentage by column on a stacked bar graph

耶瑟儿~ 2021-01-19 06:54

I'm trying to plot a stacked bar chart showing the relative percentages of each group within a column.

Here's an illustration of my problem, using the default mpg dataset.

2 answers
  • 2021-01-19 07:09

    If the plot needs counts and percentages printed on top of the coloured bars just to make the differences visible, it may be better to present the results as a simple table:

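    library(ggplot2)  # provides the mpg dataset

    # Percentage of each class within each manufacturer (margin = 2: by column)
    round(prop.table(table(mpg$class, mpg$manufacturer), margin = 2), 3) * 100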
    
    #             audi chevrolet dodge  ford honda hyundai  jeep land rover lincoln mercury nissan pontiac subaru toyota volkswagen
    # 2seater      0.0      26.3   0.0   0.0   0.0     0.0   0.0        0.0     0.0     0.0    0.0     0.0    0.0    0.0        0.0
    # compact     83.3       0.0   0.0   0.0   0.0     0.0   0.0        0.0     0.0     0.0   15.4     0.0   28.6   35.3       51.9
    # midsize     16.7      26.3   0.0   0.0   0.0    50.0   0.0        0.0     0.0     0.0   53.8   100.0    0.0   20.6       25.9
    # minivan      0.0       0.0  29.7   0.0   0.0     0.0   0.0        0.0     0.0     0.0    0.0     0.0    0.0    0.0        0.0
    # pickup       0.0       0.0  51.4  28.0   0.0     0.0   0.0        0.0     0.0     0.0    0.0     0.0    0.0   20.6        0.0
    # subcompact   0.0       0.0   0.0  36.0 100.0    50.0   0.0        0.0     0.0     0.0    0.0     0.0   28.6    0.0       22.2
    # suv          0.0      47.4  18.9  36.0   0.0     0.0 100.0      100.0   100.0   100.0   30.8     0.0   42.9   23.5        0.0
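
    As a quick sanity check on the table above, the sketch below recomputes the same proportions without rounding and confirms that each manufacturer column adds up to 100. The names pct and col_totals are introduced here for illustration only, and the code assumes ggplot2 is installed so that mpg is available.

    ```r
    library(ggplot2)  # only needed for the mpg dataset

    # margin = 2 divides every cell by its column total,
    # so each value is a share within one manufacturer
    pct <- prop.table(table(mpg$class, mpg$manufacturer), margin = 2) * 100

    # Each manufacturer column should therefore sum to 100
    col_totals <- colSums(pct)
    print(round(col_totals, 1))
    ```

    With margin = 1 the proportions would instead be taken within each class (by row), which answers a different question.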
    
  • 2021-01-19 07:18

    Comparing your question with the link you gave, the difference is that the linked example computed the counts itself. That's what I did here. I'm not sure whether this is suitable for your real data.

    library(ggplot2)
    library(dplyr)
    
    mpg %>%
      mutate(manufacturer = as.factor(manufacturer),
             class = as.factor(class)) %>%
      group_by(manufacturer, class) %>%
      summarise(count_class = n()) %>%
      group_by(manufacturer) %>%
      mutate(count_man = sum(count_class)) %>%
      mutate(percent = count_class / count_man * 100) %>%
      ggplot() +
      geom_bar(aes(x = manufacturer,
                   y = count_man, 
                   group = class,
                   fill = class), 
               stat = "identity") +
      geom_text(aes(x = manufacturer,
                    y = count_man,
                    label = sprintf("%0.1f%%", percent)),
                position = position_stack(vjust = 0.5))
    

    Edit, based on a comment:

    I made a mistake by selecting the wrong column for y: the bars should use count_class, not count_man.

    library(ggplot2)
    library(dplyr)
    
    mpg %>%
      mutate(manufacturer = as.factor(manufacturer),
             class = as.factor(class)) %>%
      group_by(manufacturer, class) %>%
      summarise(count_class = n()) %>%
      group_by(manufacturer) %>%
      mutate(count_man = sum(count_class)) %>%
      mutate(percent = count_class / count_man * 100) %>%
      ungroup() %>%
      ggplot(aes(x = manufacturer,
                 y = count_class,
                 group = class)) +
      geom_bar(aes(fill = class), 
               stat = "identity") +
      geom_text(aes(label = sprintf("%0.1f%%", percent)),
                position = position_stack(vjust = 0.5))
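
    If the bars themselves should show relative shares rather than counts, one possible variant (a sketch, not part of the original answer) is to put percent on the y axis, so that every bar has a total height of 100. The names pct_by_man and p are made up for this example, and it assumes a dplyr version in which count() accepts the name argument.

    ```r
    library(ggplot2)
    library(dplyr)

    # Count each manufacturer/class pair, then convert the counts
    # to percentages within each manufacturer
    pct_by_man <- mpg %>%
      count(manufacturer, class, name = "count_class") %>%
      group_by(manufacturer) %>%
      mutate(percent = count_class / sum(count_class) * 100) %>%
      ungroup()

    # Plot the percentages directly; geom_col() stacks by default,
    # and position_stack(vjust = 0.5) centres each label in its segment
    p <- ggplot(pct_by_man, aes(x = manufacturer, y = percent, fill = class)) +
      geom_col() +
      geom_text(aes(label = sprintf("%0.1f%%", percent)),
                position = position_stack(vjust = 0.5))

    print(p)
    ```

    Whether counts or percentages belong on the y axis depends on whether the differences in manufacturer totals matter for the real data.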
    