How to extend ggplot2 boxplot with ggproto?

星月不相逢 2020-12-29 14:42

I often use boxplots in my work and like the ggplot2 aesthetics. But the standard geom_boxplot lacks two things that are important to me: ends of whiskers and

1 answer
  • 2020-12-29 15:06

    I have been thinking about this one for a while. Basically, when you create a new primitive, you normally write a combination of:

    1. A layer-function
    2. A stat-ggproto
    3. A geom-ggproto

    Only the layer-function needs to be visible to the user. You only need to write a stat-ggproto if you need some new way of transforming your data to make your primitive. And you only need to write a geom-ggproto if you have some new grid-based graphics to create.
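
    For comparison, here is a minimal sketch of what the stat-ggproto plus layer-function pair could look like if a new data transformation were needed. The names StatMedianLabel and stat_median_label are hypothetical and chosen only for illustration; the stat computes one median per group and hands it to an existing geom.

    ```r
    library(ggplot2)

    # Hypothetical stat: reduces each group to one row holding the median of y.
    StatMedianLabel <- ggproto("StatMedianLabel", Stat,
      required_aes = c("x", "y"),
      compute_group = function(data, scales) {
        med <- median(data$y)
        data.frame(x = data$x[1], y = med, label = round(med, 3))
      }
    )

    # Hypothetical layer-function: the only piece a user would call directly.
    stat_median_label <- function(mapping = NULL, data = NULL, geom = "label",
                                  position = "identity", na.rm = FALSE,
                                  show.legend = NA, inherit.aes = TRUE, ...) {
      layer(
        stat = StatMedianLabel, data = data, mapping = mapping, geom = geom,
        position = position, show.legend = show.legend,
        inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
      )
    }

    p <- ggplot(mpg, aes(class, hwy)) + geom_boxplot() + stat_median_label()
    ```

    No geom-ggproto is written here because the existing label geom already knows how to draw the result.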

    In this case, where we are basically composing layer-functions that already exist, we don't really need to write new ggprotos. It is enough to write a new layer-function. This layer-function will create the three layers that you are already using and map the parameters the way you intend. In this case:

    • Layer1 - uses geom_errorbar and stat_boxplot - to get our errorbars
    • Layer2 - uses geom_boxplot and stat_boxplot - to create the boxplots
    • Layer3 - uses geom_label and stat_summary - to create the text labels with the median value in the center of the boxes.

    Of course, you could write a new stat-ggproto and a new geom-ggproto that do all of these things at once. Or maybe you could compose stat_summary and stat_boxplot into one, and the three geom-ggprotos as well, and do this with one layer. But there is little point unless we have efficiency problems.

    Anyway, here is the code:

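    library(ggplot2)

    # Layer-function: builds three layers (whisker caps, boxes, median labels)
    # from a formula such as `hwy ~ class`.
    geom_myboxplot <- function(formula = NULL, data = NULL,
                               stat = "boxplot", position = "dodge", coef = 1.5,
                               font = "sans", fsize = 18, width = 0.6,
                               fun.data = NULL, fun.y = NULL, fun.ymax = NULL,
                               fun.ymin = NULL, fun.args = list(),
                               outlier.colour = NULL, outlier.color = NULL,
                               outlier.shape = 19, outlier.size = 1.5, outlier.stroke = 0.5,
                               notch = FALSE, notchwidth = 0.5, varwidth = FALSE,
                               na.rm = FALSE, show.legend = NA,
                               inherit.aes = TRUE, ...) {
        # Left-hand side of the formula is the response (y), right-hand side the factor (x).
        vars <- all.vars(formula)
        response <- vars[1]
        factor <- vars[2]
        # aes_string() is soft-deprecated in newer ggplot2 releases but still works.
        mymap <- aes_string(x = factor, y = response)
        # Accept the American spelling too, as geom_boxplot() does.
        if (!is.null(outlier.color)) outlier.colour <- outlier.color
        # Summary function for the labels: median of each group, rounded to 3 digits.
        # Note that it is always used; the `fun.data` argument is not consulted.
        fun_med <- function(x) {
            return(data.frame(y = median(x), label = round(median(x), 3)))
        }
        # All three layers share one dodge so they line up; this overrides `position`.
        position <- position_dodge(width)
        # Layer 1: error bars computed by the boxplot stat (the whisker ends).
        l1 <- layer(data = data, mapping = mymap, stat = StatBoxplot,
                geom = "errorbar", position = position, show.legend = show.legend,
                inherit.aes = inherit.aes, params = list(na.rm = na.rm,
                    coef = coef, width = width, ...))
        # Layer 2: the boxes themselves.
        l2 <- layer(data = data, mapping = mymap, stat = stat, geom = GeomBoxplot,
                position = position, show.legend = show.legend, inherit.aes = inherit.aes,
                params = list(outlier.colour = outlier.colour, outlier.shape = outlier.shape,
                    outlier.size = outlier.size, outlier.stroke = outlier.stroke,
                    notch = notch, notchwidth = notchwidth, varwidth = varwidth,
                    na.rm = na.rm, ...))
        # Layer 3: median labels. ggplot2 3.3.0 and later rename fun.y, fun.ymax and
        # fun.ymin to fun, fun.max and fun.min, so the old names may trigger a warning there.
        l3 <- layer(data = data, mapping = mymap, stat = StatSummary,
                geom = "label", position = position, show.legend = show.legend,
                inherit.aes = inherit.aes, params = list(fun.data = fun_med,
                    fun.y = fun.y, fun.ymax = fun.ymax, fun.ymin = fun.ymin,
                    fun.args = fun.args, na.rm = na.rm, family = font,
                    size = fsize / 3, vjust = -0.1, ...))
        # Adding a list of layers to a plot with `+` adds each layer in turn.
        return(list(l1, l2, l3))
    }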
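
    The formula-to-mapping step in the function above relies on aes_string(), which is soft-deprecated in newer ggplot2 releases. As a hedged alternative, the same mapping could be built with aes() and tidy-evaluation unquoting. The helper name formula_to_aes is hypothetical, and the sketch assumes the rlang package is available (it is installed as a ggplot2 dependency).

    ```r
    library(ggplot2)

    # Hypothetical helper: turn `response ~ factor` into an aes() mapping
    # without aes_string(), by unquoting symbols built from the variable names.
    formula_to_aes <- function(formula) {
      vars <- all.vars(formula)
      aes(x = !!rlang::sym(vars[2]), y = !!rlang::sym(vars[1]))
    }

    m <- formula_to_aes(hwy ~ class)
    p <- ggplot(mpg, m) + geom_boxplot()
    ```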
    

    This lets you create your customized boxplots like this:

    ggplot(mpg) +
      geom_myboxplot( hwy ~ class, font = "sans",fsize = 18)+
      theme_grey(base_family = "sans",base_size = 18 )
    

    And they look like this:

    Note: we did not actually have to use the layer function; we could have used the original stat_boxplot, geom_boxplot, and stat_summary calls in its place. But we would still have had to fill in all the parameters if we wanted to be able to control them from our custom boxplot, so I think it was clearer this way, at least from the point of view of structure as opposed to functionality. Maybe it isn't, though; it is a matter of taste.
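
    As a rough sketch of that alternative, the same three layers can be assembled from the high-level constructors. The function name geom_myboxplot2 and the helper median_label are hypothetical, only a few of the parameters are exposed, and the x and y aesthetics are assumed to come from the plot's own mapping rather than from a formula.

    ```r
    library(ggplot2)

    # Hypothetical summary helper: median of each group, rounded for the label.
    median_label <- function(x) {
      data.frame(y = median(x), label = round(median(x), 3))
    }

    # Hypothetical layer-function built from stat_boxplot, geom_boxplot and stat_summary.
    geom_myboxplot2 <- function(width = 0.6, coef = 1.5,
                                font = "sans", fsize = 18, ...) {
      dodge <- position_dodge(width)
      list(
        # whisker ends
        stat_boxplot(geom = "errorbar", position = dodge, width = width, coef = coef),
        # boxes
        geom_boxplot(position = dodge, width = width, coef = coef, ...),
        # median labels
        stat_summary(fun.data = median_label, geom = "label", position = dodge,
                     family = font, size = fsize / 3, vjust = -0.1)
      )
    }

    p <- ggplot(mpg, aes(class, hwy)) + geom_myboxplot2()
    ```

    The trade-off is the one described above: each constructor hides its own defaults, so every parameter that should be controllable still has to be passed through by hand.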

    Also, I don't have that font, which does look a lot nicer, and I did not feel like tracking it down and installing it.
