Order of factor levels changes when plotting layers with data subsets

滥情空心 2021-02-14 02:26

I am trying to control the order of items in a legend in a ggplot2 plot in R. I looked up some other similar questions and found out about changing the order of the

2 Answers
  • 2021-02-14 02:46

    One possibility is to add a geom_blank as the first layer in the plot. From ?geom_blank: "The blank geom draws nothing, but can be a useful way of ensuring common scales between different plots." We tell the geom_blank layer to use the entire data set, so this layer sets up a scale that includes all levels of 'Month', correctly ordered. Then add the two geom_pointrange layers, each of which uses a subset of the data.

    Perhaps a matter of taste in this particular case, but I tend to prefer to prepare the data sets before I use them in ggplot.

    df_sum <- testdata[testdata$Month %in% c("June", "July"), ]
    df_win <- testdata[testdata$Month %in% c("December", "January"), ]
    
    ggplot(data = testdata, aes(x = hour, y = avg_hou, ymin = lower_ci, ymax = upper_ci,
           color = Month, shape = Month)) +
      geom_blank() +
      geom_pointrange(data = df_sum, size = 1, position = position_dodge(width = 0.3)) +
      geom_pointrange(data = df_win, size = 1, position = position_dodge(width = 0.6)) +
      scale_color_manual(values = c("June" = "#FDB863", "July" = "#E66101",
                         "December" = "#92C5DE", "January" = "#0571B0"))
    
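Before plotting, it can help to confirm that the full data set really carries every level in the intended order and that the two subsets split it cleanly. The following is a minimal base R sketch; the small `testdata` built here is a stand-in with only the `Month` and `hour` columns (an assumption for illustration, not the asker's data), and `month_levels`, `summer_months` and `winter_months` are hypothetical helper names.

```r
# Stand-in for the question's data: only the Month factor matters here.
month_levels <- c("June", "July", "December", "January")
testdata <- data.frame(
  Month = factor(rep(c("December", "January", "June", "July"), each = 3),
                 levels = month_levels),
  hour  = rep(1:3, times = 4)
)

# Same subsetting as in the answer above.
df_sum <- testdata[testdata$Month %in% c("June", "July"), ]
df_win <- testdata[testdata$Month %in% c("December", "January"), ]

# The full data carries all four levels in the order wanted for the legend.
print(levels(testdata$Month))

# Each subset only contains rows for two of the months ...
summer_months <- unique(as.character(df_sum$Month))
winter_months <- unique(as.character(df_win$Month))

# ... and together the subsets account for every row of the full data.
print(nrow(df_sum) + nrow(df_win) == nrow(testdata))
```

If the level order printed here is not the one wanted in the legend, it is probably easier to fix it in the data with factor(..., levels = ...) than in the plot.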

  • 2021-02-14 03:02

    Another way to think about "dodge" is as an offset from the x-values based on group (in this case Month). So we can add a dodge (x-offset) column to your original data, based on month:

    library(ggplot2)

    # your original sample data
    # note the use of set.seed(...) so "random" data is reproducible
    set.seed(1)
    hour     <- rep(seq(from=1,to=24,by=1),4)
    avg_hou  <- sample(seq(0,0.5,0.001),96,replace=TRUE)
    lower_ci <- avg_hou - sample(seq(0,0.05,0.001),96,replace=TRUE)
    upper_ci <- avg_hou + sample(seq(0,0.05,0.001),96,replace=TRUE)
    Month    <- c(rep("December",24), rep("January",24), rep("June",24), rep("July",24))
    testdata       <- data.frame(Month,hour,avg_hou,lower_ci,upper_ci)
    testdata$Month <- factor(testdata$Month,levels=c("June", "July", "December","January"))
    
    # add offset column for dodge
    testdata$dodge <- -2.5+(as.integer(testdata$Month))
    
    # create ggplot object and default mappings
    ggp <- ggplot(testdata, aes(x=hour, y = avg_hou, ymin = lower_ci, ymax = upper_ci, color = Month, shape = Month))
    ggp <- ggp + scale_color_manual(values = c("June" = "#FDB863", "July" = "#E66101", "December" = "#92C5DE", "January" = "#0571B0"))
    
    # plot the point range
    ggp + geom_pointrange(aes(x=hour+0.2*dodge), size=1)
    

    This does not require geom_blank(...) to maintain the scale order, and it does not require two calls to geom_pointrange(...).
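To make the offset arithmetic concrete, the sketch below works through the dodge values for the four months in base R. The names `x_offset`, `n` and `dodge_general` are introduced here for illustration only, and the generalisation to any number of levels is an assumption, not part of the answer above.

```r
month_levels <- c("June", "July", "December", "January")
Month <- factor(month_levels, levels = month_levels)

# Centre the integer codes 1..4 on zero, giving -1.5, -0.5, 0.5, 1.5.
dodge <- as.integer(Month) - 2.5

# Horizontal shift applied by aes(x = hour + 0.2 * dodge):
# roughly -0.3, -0.1, 0.1, 0.3 hours, in factor-level order.
x_offset <- 0.2 * dodge
names(x_offset) <- month_levels
print(x_offset)

# Hypothetical generalisation: centre on (n + 1) / 2 for n levels,
# which reduces to 2.5 when there are four months.
n <- nlevels(Month)
dodge_general <- as.integer(Month) - (n + 1) / 2
```

Because the offsets follow the integer codes of the factor, June and July land to the left of each hour and December and January to the right, in the same order as the factor levels.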
