Multi-group histogram with group-specific frequencies

Posted by 假装没事ソ on 2020-01-06 15:42:11

Question


First off, I've already read the following thread: ggplot2 - Multi-group histogram with in-group proportions rather than frequency

I followed the ddply suggestion and it didn't seem to work for my data. Logically the code should work perfectly on my dataset and I have no idea what I'm doing wrong.

Overall: I'd like to make a histogram (I'm learning ggplot) that displays the genotype frequency in each of my study groups.

Something like this:

Here's a mock data set that mirrors my own:

df <- data.frame(ID = 1:60,
                 Genotypes = sample(c("CG", "CC", "GG"), 60, replace = TRUE),
                 Study_Group = sample(c("Control", "Pathology1", "pathology2"), 60, replace = TRUE))

I've tried variants of p + geom_bar(aes(y = ..count../sum(..count..))), but R returns "cannot find 'count' object" or something to that effect.

I also tried:

df.new<-ddply(df,.(Study_Group),summarise,
              prop=prop.table(table(df$Genotype)),
              Genotype=names(table(df$Genotype)))

And I believe there was an error with the summarise function, but to be honest, I have no idea what I'm doing.

Is the problem simply my comprehension of the solution or is it something inherently different in my data set?

Thanks for the help.


Answer 1:


Give this a try. I am using dplyr, a package that contains updated versions of the ddply-type functions from plyr. One note: I am not sure whether you want your x-axis to be Study_Group or Genotypes. Your question says you want the frequency of each genotype within each group, but your graph has the genotypes on the x-axis. The solution follows the stated desire, not the plot. However, making the change to get Genotypes on the x-axis is simple; I've noted in the code comments where and what to change.

library(dplyr)
library(ggplot2)

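# Count each Study_Group/Genotypes combination, then convert the counts
# to proportions within each study group
df2 <- df %>%
  count(Study_Group, Genotypes) %>%
  group_by(Study_Group) %>% # change to `group_by(Genotypes) %>%` for the alternative approach
  mutate(prop = n / sum(n))

# Plot the precomputed proportions as side-by-side bars
ggplot(data = df2, aes(Study_Group, prop, fill = Genotypes)) +
  geom_bar(stat = "identity", position = "dodge")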
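
If the goal is to keep the within-group proportions but put the genotypes on the x-axis, as in the plot from the question, one option is to leave the grouping alone and swap the aesthetics instead. The following is only a sketch under a few assumptions: the fixed seed, the `ungroup()` call, the `p_alt` name, and the use of `geom_col()` (a shorthand for `geom_bar(stat = "identity")`) are additions for illustration and are not part of the code above. It also includes a quick check that the proportions sum to 1 inside every study group.

```r
library(dplyr)
library(ggplot2)

# Assumption: a fixed seed so the mock data is reproducible
set.seed(42)
df <- data.frame(ID = 1:60,
                 Genotypes = sample(c("CG", "CC", "GG"), 60, replace = TRUE),
                 Study_Group = sample(c("Control", "Pathology1", "pathology2"), 60, replace = TRUE))

# Proportion of each genotype within its study group
df2 <- df %>%
  count(Study_Group, Genotypes) %>%
  group_by(Study_Group) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# Sanity check: the proportions should sum to 1 within every study group
df2 %>%
  group_by(Study_Group) %>%
  summarise(total = sum(prop))

# Same proportions, but with Genotypes on the x-axis and one bar per study group
p_alt <- ggplot(df2, aes(Genotypes, prop, fill = Study_Group)) +
  geom_col(position = "dodge")
p_alt
```

Note that this is different from switching to `group_by(Genotypes)`: that would make the bars for each genotype sum to 1 across study groups, whereas here the bars of each fill colour (each study group) sum to 1.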


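For comparison with the `prop.table` attempt in the question: inside `ddply(..., summarise, ...)`, an expression like `table(df$Genotype)` refers to the whole data frame rather than to the current group's subset, so every group would receive the same overall proportions. A base R sketch that gets the within-group proportions without plyr or dplyr is shown below. The seed and the names `counts`, `props`, and `df_prop` are assumptions made for illustration.

```r
# Assumption: a fixed seed so the mock data is reproducible
set.seed(42)
df <- data.frame(ID = 1:60,
                 Genotypes = sample(c("CG", "CC", "GG"), 60, replace = TRUE),
                 Study_Group = sample(c("Control", "Pathology1", "pathology2"), 60, replace = TRUE))

# Cross-tabulate the counts: rows are study groups, columns are genotypes
counts <- table(Study_Group = df$Study_Group, Genotypes = df$Genotypes)

# margin = 1 divides each row by its row total,
# giving the genotype proportions within each study group
props <- prop.table(counts, margin = 1)

# Convert to a long data frame (one row per group/genotype pair) for plotting
df_prop <- as.data.frame(props, responseName = "prop")
df_prop
```

The resulting `df_prop` has the columns `Study_Group`, `Genotypes`, and `prop`, so it can be passed to the same `ggplot()` call in place of `df2`.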

Source: https://stackoverflow.com/questions/41030350/multi-group-histogram-with-group-specific-frequencies
