Question
First off, I've already read the following thread: ggplot2 - Multi-group histogram with in-group proportions rather than frequency
I followed the ddply suggestion and it didn't seem to work for my data. Logically the code should work perfectly on my dataset and I have no idea what I'm doing wrong.
Overall: I'd like to make a histogram (I'm learning ggplot) that displays the genotype frequency in each of my study groups.
Something like this:
Here's a mock data set that mirrors my own:
df <- data.frame(ID = 1:60,
                 Genotypes = sample(c("CG", "CC", "GG"), 60, replace = TRUE),
                 Study_Group = sample(c("Control", "Pathology1", "pathology2"), 60, replace = TRUE))
I've tried variants of p + geom_bar(aes(y = ..count../sum(..count..)))
but R returns "cannot find 'count' object" or something to that effect.
I also tried:
df.new<-ddply(df,.(Study_Group),summarise,
prop=prop.table(table(df$Genotype)),
Genotype=names(table(df$Genotype)))
And I believe there was an error with the summarise function, but to be honest, I have no idea what I'm doing.
Is the problem simply my comprehension of the solution or is it something inherently different in my data set?
Thanks for the help.
Answer 1:
Give this a try. I am using dplyr, a package that contains updated versions of the ddply-type functions from plyr. One thing: I am not sure whether you want your x-axis to be the Study_Group values or the Genotypes. Your question states that you want the frequency of each genotype within each group, but your graph has Genotypes on the x-axis. The solution follows the stated desire, not the plot. However, making the change to get Genotypes on the x-axis is simple; I note in the code comments where and what to change.
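library(dplyr)
library(ggplot2)

# Count each Study_Group/Genotypes combination, then convert the counts
# to proportions within each study group.
df2 <- df %>%
  count(Study_Group, Genotypes) %>%
  group_by(Study_Group) %>% # change to `group_by(Genotypes) %>%` for alternative approach
  mutate(prop = n / sum(n))

# For the alternative approach, also swap the aesthetics:
# aes(Genotypes, prop, fill = Study_Group)
ggplot(data = df2, aes(Study_Group, prop, fill = Genotypes)) +
  geom_bar(stat = "identity", position = "dodge")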
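
For reference, here is a minimal sketch of the alternative with Genotypes on the x-axis. It rebuilds the mock data from the question so that it runs on its own. The set.seed() call and the names df3 and p are assumptions added for illustration and are not part of the original answer.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the mock data is reproducible
df <- data.frame(ID = 1:60,
                 Genotypes = sample(c("CG", "CC", "GG"), 60, replace = TRUE),
                 Study_Group = sample(c("Control", "Pathology1", "pathology2"), 60, replace = TRUE))

# Count each combination, then compute proportions within each genotype.
df3 <- df %>%
  count(Study_Group, Genotypes) %>%
  group_by(Genotypes) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# Genotypes on the x-axis, one dodged bar per study group.
p <- ggplot(data = df3, aes(Genotypes, prop, fill = Study_Group)) +
  geom_bar(stat = "identity", position = "dodge")
print(p)
```

With group_by(Genotypes), the bars within each genotype sum to 1. If you want proportions within each study group but still want Genotypes on the x-axis, keeping group_by(Study_Group) and swapping only the aesthetics should work as well.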
Source: https://stackoverflow.com/questions/41030350/multi-group-histogram-with-group-specific-frequencies