Create a boxplot in R that labels a box with the sample size (N)

囚心锁ツ 2021-02-01 22:05

Is there a way to create a boxplot in R that will display with the box (somewhere) an "N=(sample size)"? The varwidth logical adjusts the width of the box on the basis of sam

5 Answers
  • 2021-02-01 22:43

    Here's some ggplot2 code. It's going to display the sample size at the sample mean, making the label multifunctional!

    First, a simple function to pass as fun.data:

    give.n <- function(x){
       return(c(y = mean(x), label = length(x)))
    }
    

    Now, to demonstrate with the diamonds data

    library(ggplot2)  # also provides the diamonds dataset

    ggplot(diamonds, aes(cut, price)) + 
       geom_boxplot() + 
       stat_summary(fun.data = give.n, geom = "text")
    

    You may have to play with the text size to make it look good, but now you have a label for the sample size which also gives a sense of the skew.
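
    If you would rather keep the label clear of the box, here is a minimal sketch of a variant that returns a one-row data frame and writes "n=..." just above each group's maximum. The helper name n_label and the vjust and size values are assumptions chosen for illustration, not part of the code above.

    ```r
    library(ggplot2)

    # Hypothetical helper: one row with the label position (y) and the label text.
    n_label <- function(x) {
      data.frame(y = max(x),
                 label = paste0("n=", length(x)),
                 stringsAsFactors = FALSE)
    }

    p <- ggplot(diamonds, aes(cut, price)) +
      geom_boxplot() +
      # vjust = -0.5 nudges the text above the group's maximum value
      stat_summary(fun.data = n_label, geom = "text", vjust = -0.5, size = 3)

    print(p)
    ```

    Anchoring the label at the maximum keeps it away from the box and whiskers; if the topmost label gets clipped, widening the y limits a little should help.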

  • 2021-02-01 22:46

    I figured out a workaround using the EnvStats package. Install it once with install.packages("EnvStats"), then load it with:

    library(EnvStats)
    

    EnvStats provides stripChart (not the same as base R's stripchart), which can annotate a chart with values such as the sample size n. I drew my boxplot first, then called stripChart with add=TRUE to overlay it. Most stripChart elements are switched off in the call so that only the n values show up on top of the boxplot.

    First, the boxplot:

    # Set the margins before plotting rather than inside the boxplot() call
    par(mar=c(8,4,4,2))
    boxplot(data.frame(T0_G1,T24h_G1,T96h_G1,T7d_G1,T11d_G1,T15d_G1,T30d_G1),
            main="All Rheometry Tests on Egg Plasma at All Time Points at 0.1Hz,0.1% and 37 Set 1,2,3",
            names=c("0h","24h","96h","7d ", "11d", "15d", "30d"),
            boxwex=0.6)
    

    Then the stripChart overlay:

    # add=TRUE draws on the existing boxplot; the points, axes and CIs are hidden
    stripChart(data.frame(T0_G1,T24h_G1,T96h_G1,T7d_G1,T11d_G1,T15d_G1,T30d_G1),
               show.ci=FALSE, axes=FALSE, points.cex=0,
               n.text.line=1.6, n.text.cex=0.7,
               add=TRUE, location.scale.text="none")
    

    You can always adjust the height of the numbers (the n values), for example with n.text.line, so that they sit where you want.

  • 2021-02-01 22:49

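    To get the n above each box, you could use text with the stat details returned by boxplot, as follows:

    # boxplot() draws the plot and invisibly returns its statistics
    b <- boxplot(xvar ~ f1, data=frame)
    # Write each label one data unit above the end of the upper whisker
    text(seq_along(b$n), b$stats[5,]+1, paste0("n=", b$n))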
    

    The stats field of b is a matrix with one column per group. From top to bottom, each column holds the extreme of the lower whisker, the lower hinge, the median, the upper hinge and the extreme of the upper whisker, so row 5 is the end of the upper whisker.
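
    Labels anchored at the upper whisker can collide with outliers. As a rough sketch of one way around that, the labels can all go on one line near the top of the plot instead. The built-in InsectSprays data is used here only as a stand-in for xvar, f1 and frame, and the 1.15 headroom factor is an arbitrary choice.

    ```r
    # Sketch only: InsectSprays (built in) replaces the placeholder data.
    # Extend ylim so there is headroom above the data for the labels.
    b <- boxplot(count ~ spray, data = InsectSprays,
                 ylim = c(0, 1.15 * max(InsectSprays$count)))

    # par("usr")[4] is the top edge of the plot region;
    # pos = 1 draws the text just below that coordinate.
    text(x = seq_along(b$n), y = par("usr")[4],
         labels = paste0("n=", b$n), pos = 1, cex = 0.8)
    ```

    Because every label shares the same y position, the counts line up and are easy to compare across groups.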

  • 2021-02-01 22:52

    The gplots package provides boxplot.n, which according to the documentation produces a boxplot annotated with the number of observations. In more recent gplots releases the function is, as far as I know, named boxplot2.

  • 2021-02-01 22:54

    You can use the names parameter to write the n next to each factor name.

    If you don't want to calculate the n yourself, you could use this little trick:

    # Compute the boxplot statistics without drawing anything
    b <- boxplot(xvar ~ f1, data=frame, plot=FALSE)
    # b$n now holds the count for each factor level; append the counts to the names
    boxplot(xvar ~ f1, data=frame, xlab="input values", names=paste0(b$names, " (n=", b$n, ")"))
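
    If you do want to calculate the n yourself, table() is probably the shortest route. This is only a sketch: InsectSprays is a built-in dataset standing in for frame, and group_labels is a name made up for the example.

    ```r
    # Sketch only: InsectSprays (built in) replaces the placeholder data.
    counts <- table(InsectSprays$spray)   # number of rows per factor level
    group_labels <- paste0(names(counts), " (n=", as.vector(counts), ")")

    boxplot(count ~ spray, data = InsectSprays,
            xlab = "input values", names = group_labels)
    ```

    One caveat: table() counts rows, including rows whose response is NA, whereas b$n only counts the values that actually went into each box. With missing data the b$n route is the safer one.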
    