R ggplot2 boxplots - ggpubr stat_compare_means not working properly

星月不相逢 2020-12-06 07:37

I am trying to add significance levels to my boxplots in the form of asterisks using ggplot2 and the ggpubr package, but I

1 Answer
  • 2020-12-06 08:28

    Edit: Since discovering the rstatix package, I would do the following:

    set.seed(123)
    #test df
    mydf <- data.frame(ID=paste(sample(LETTERS, 163, replace=TRUE), sample(1:1000, 163, replace=FALSE), sep=''),
                       Group=c(rep('C',10),rep('FH',10),rep('I',19),rep('IF',42),rep('NA',14),rep('NF',42),rep('NI',15),rep('NS',10),rep('PGMC4',1)),
                       Value=c(runif(n=100), runif(63,max= 0.5)))
    
    
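    library(tidyverse)
    
    # pairwise Wilcoxon tests; keep only the significant comparisons
    stat_pvalue <- mydf %>% 
      rstatix::wilcox_test(Value ~ Group) %>%
      filter(p < 0.05) %>% 
      rstatix::add_significance("p") %>% 
      rstatix::add_y_position() %>% 
      # spread the brackets evenly between the lowest and highest position
      mutate(y.position = seq(min(y.position), max(y.position), length.out = n()))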
    
    ggplot(mydf, aes(x=Group, y=Value)) + geom_boxplot() +
      ggpubr::stat_pvalue_manual(stat_pvalue, label = "p.signif") +
      theme_bw(base_size = 16)
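
    The mutate() step is what keeps the brackets from piling up: it replaces the heights from add_y_position() with evenly spaced ones over the same range. A minimal base R sketch of just that step, using invented heights (y_raw is a hypothetical stand-in, not real rstatix output):

    ```r
    # Hypothetical bracket heights, crowded near the top of the data
    # (values invented for illustration only).
    y_raw <- c(1.02, 1.05, 1.06, 1.10, 1.31)

    # Same number of brackets, spread evenly between the lowest and highest height.
    y_even <- seq(min(y_raw), max(y_raw), length.out = length(y_raw))

    print(y_even)
    # The gaps between neighbouring brackets are now all the same size.
    print(diff(y_even))
    ```

    The new heights are assigned in row order, so the first comparison in the table gets the lowest bracket.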
    

    Old Answer:

    You can try the following. The idea is to calculate the statistics yourself with pairwise.wilcox.test and then use geom_signif from the ggsignif package to add the precalculated p-values. With y_position you can place the brackets so that they don't overlap.

    library(tidyverse)
    library(ggsignif)
    library(ggpubr)  # provides stat_compare_means()
    library(broom)
    # the list of pairwise combinations you want to compare (all groups except the 9th level, PGMC4)
    # factor() is needed because data.frame() no longer converts strings to factors since R 4.0
    CN <- combn(levels(factor(mydf$Group))[-9], 2, simplify = FALSE)
    # the p-values; broom::tidy() returns them as a tidy data frame. Note that the p-value adjustment is turned off.
    pv <- tidy(with(mydf[ mydf$Group != "PGMC4", ], pairwise.wilcox.test(Value, Group, p.adjust.method = "none")))
    #  data preparation 
    CN2 <- do.call(rbind.data.frame, CN)
    colnames(CN2) <- colnames(pv)[-3]
    # subset the pvalues, by merging the CN list
    pv_final <- merge(CN2, pv, by.x = c("group2", "group1"), by.y = c("group1", "group2"))
    # fix ordering
    pv_final <- pv_final[order(pv_final$group1), ] 
    # set signif level
    pv_final$map_signif <- ifelse(pv_final$p.value > 0.05, "", ifelse(pv_final$p.value > 0.01,"*", "**"))  
    
    # the plot
    ggplot(mydf, aes(x=Group, y=Value, fill=Group)) + geom_boxplot() +
      stat_compare_means(data=mydf[ mydf$Group != "PGMC4", ], aes(x=Group, y=Value, fill=Group), size=5) + 
      ylim(-4,30)+
      geom_signif(comparisons=CN,
                  y_position = 3:30, annotation= pv_final$map_signif) + 
      theme_bw(base_size = 16)
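
    To see what the merge step is working with, it may help to look at the raw result of pairwise.wilcox.test. The sketch below uses base R only and an invented toy data set (toy, pmat and pv_long are hypothetical names); it flattens the p-value matrix into one row per comparison, which appears to be the same group1/group2/p.value layout that tidy() returns:

    ```r
    set.seed(1)
    # Toy data, invented for illustration: three groups, the last one shifted upwards.
    toy <- data.frame(
      Group = rep(c("A", "B", "C"), each = 12),
      Value = c(rnorm(12), rnorm(12), rnorm(12, mean = 3))
    )

    # Unadjusted pairwise Wilcoxon rank-sum tests.
    res <- pairwise.wilcox.test(toy$Value, toy$Group, p.adjust.method = "none")

    # res$p.value is a lower-triangular matrix: rows hold the later group,
    # columns the earlier one, and the unused upper triangle is NA.
    pmat <- res$p.value

    # Flatten to one row per comparison, dropping the NA cells.
    keep <- !is.na(pmat)
    pv_long <- data.frame(
      group1  = rownames(pmat)[row(pmat)[keep]],
      group2  = colnames(pmat)[col(pmat)[keep]],
      p.value = pmat[keep]
    )
    print(pv_long)
    ```

    Because group1 is the later level here, the merge above matches group2 of CN2 against group1 of pv, and vice versa.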
    

    The arguments vjust, textsize, and size do not work properly. This seems to be a bug in the latest version, ggsignif_0.3.0.


    Edit: If you want to show only the significant comparisons, you can simply subset the list CN. Since I updated to ggsignif_0.4.0 and R version 3.4.1, vjust and textsize work as expected. Instead of y_position you can try step_increase.

    # logical index of the significant comparisons
    gr <- pv_final$p.value <= 0.05
    CN[gr]  # the comparisons that remain
    
    ggplot(mydf, aes(x=Group, y=Value, fill=Group)) + 
      geom_boxplot() +
      stat_compare_means(data=mydf[ mydf$Group != "PGMC4", ], aes(x=Group, y=Value, fill=Group), size=5) + 
      geom_signif(comparisons=CN[gr], textsize = 12, vjust = 0.7, 
                 step_increase=0.12, annotation= pv_final$map_signif[gr]) + 
      theme_bw(base_size = 16)
    

    You can use ggpubr as well by adding this layer to the plot:

    stat_compare_means(comparisons=CN[gr], method="wilcox.test", label="p.signif", color="red")
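
    Both map_signif and label="p.signif" come down to binning p-values into symbols. If you need more levels than the nested ifelse() above, a lookup with findInterval() is easier to extend. A minimal base R sketch with the same cut-offs as map_signif (the p-values are invented; cutoffs and symbols are hypothetical names):

    ```r
    # Invented p-values for illustration.
    p <- c(0.20, 0.05, 0.03, 0.01, 0.004)

    # Upper bounds of the bins, in increasing order, and one symbol per bin.
    # Bins are (-Inf, 0.01], (0.01, 0.05], (0.05, Inf).
    cutoffs <- c(0.01, 0.05)
    symbols <- c("**", "*", "")

    # left.open = TRUE puts a value equal to a cut-off into the lower bin,
    # which matches the "p.value > 0.05" / "p.value > 0.01" tests of the ifelse() version.
    stars <- symbols[findInterval(p, cutoffs, left.open = TRUE) + 1]
    print(stars)
    ```

    Adding a level such as "***" for p <= 0.001 then only means prepending one cut-off and one symbol.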
    
