How do I manually set geom_bar fill color in ggplot

清歌不尽 2021-01-04 10:27

I'm trying to create several graphs using ggplot. The graphs are a series of bar graphs that together describe a line as well EXAMPLE (BTW, yes I realize the color palett

2 Answers
  • 2021-01-04 11:01

    I'm guessing that you've looked at the ggplot color blind example shown here? Without your data, I can only speculate that your geom_bar calls create ambiguity regarding which layer to apply the fill changes to since your initial call to ggplot doesn't have an aes argument. Try moving all of your data into a single dataframe and reference it in the initial call to ggplot, e.g.,

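    ggplot(df, aes(x = cond, y = yval, fill = cond)) +  # map fill so the manual scale has something to colour
        geom_bar(stat = "identity") +                   # y is supplied, so plot the values instead of counting rows
        scale_fill_manual(values = cbbPalette)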
    

    where df is the dataframe containing your data and aes is the mapping between your variables. This makes it clear to ggplot that you want the fill colors of geom_bar to correspond to the data in df. There are ways to make this work with your current code, but they're unconventional for creating standard bar plots.

  • 2021-01-04 11:24

    The answer of Jay B. Martin doesn't fully answer the question. So although this question is quite old, here is a solution for future reference. We make some data for a reproducible example:

    library(ggplot2)
    library(tibble)   # provides tibble()

    color_table <- tibble(
      Land_cover = c("Agriculture", "Forest", "Ocean", "Lake", "Populated"),
      Color = c("yellow", "darkgreen", "blue4", "lightblue", "maroon3")
      )
    
    df <- data.frame(
      Region = c(rep(1,5), rep(2,5)),
      Area_no = c(1,2,3,4,5,1,2,3,4,5),
      Land_cover = c("Agriculture", "Forest", "Agriculture", "Agriculture", "Lake", 
                     "Lake", "Populated", "Populated", "Ocean", "Populated"), 
      Square_km = c(10,15,7,12,3, 5,30,20,40,10)
      )
    

    So we want to use df to make a graph for each Region, where each Land_cover is drawn in the colour given by color_table. First, we must make sure that the Land_cover variable in df is a factor whose levels are in the same order as the colours we want to assign to each type of land cover. We do that by taking the level order from color_table:

    df$Land_cover <- factor(df$Land_cover, levels = color_table$Land_cover)
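
    One thing worth checking at this step: factor() silently turns any value that is not listed in levels into NA, so a misspelled category would lose its colour. A minimal sketch of such a check, using a cut-down copy of the example data (the name unknown is only for illustration):

    ```r
    color_table <- data.frame(
      Land_cover = c("Agriculture", "Forest", "Ocean", "Lake", "Populated"),
      Color = c("yellow", "darkgreen", "blue4", "lightblue", "maroon3"),
      stringsAsFactors = FALSE
    )
    df <- data.frame(
      Land_cover = c("Agriculture", "Forest", "Lake", "Populated", "Ocean"),
      stringsAsFactors = FALSE
    )

    # Categories in the data that have no entry in color_table
    unknown <- setdiff(unique(df$Land_cover), color_table$Land_cover)
    stopifnot(length(unknown) == 0)

    # Safe to convert: every value has a matching level
    df$Land_cover <- factor(df$Land_cover, levels = color_table$Land_cover)
    levels(df$Land_cover)  # same order as color_table$Color
    ```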
    

    Now, by far the simplest way to plot with the correct colors is, as Jay B. Martin suggests in the comments, to use facet_grid() or facet_wrap():

    ggplot(df, aes(x = Area_no, y = Square_km, fill = Land_cover)) +
      geom_col() +
      scale_fill_manual(values = color_table$Color) +
      facet_grid(.~Region) 
    

    But what if you want to make a separate plot for each Region? For instance, you want to save each plot as a separate file.

    The problem

    If we simply loop over the regions, select the matching subset of the data, and reuse the code above (without facet_grid), we clearly get the wrong colours (shown here for Region 2):

    for (region in 1:2){
      gg <- ggplot(subset(df, Region %in% region),
                   aes(x = Area_no, y = Square_km, fill = Land_cover)) +
        geom_col() + 
        scale_fill_manual(values = color_table$Color) 
      ggsave(paste0("Areas_region_", region, ".png"), plot = gg, width = 5, height = 3)
      }
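
    The reason, as far as I understand ggplot2's behaviour: a discrete scale drops unused factor levels by default (drop = TRUE), and an unnamed values vector is then assigned by position to the levels that remain. The base R sketch below only mimics that positional matching for Region 2. It is not ggplot2's actual code, and present and assigned are illustrative names:

    ```r
    color_table <- data.frame(
      Land_cover = c("Agriculture", "Forest", "Ocean", "Lake", "Populated"),
      Color = c("yellow", "darkgreen", "blue4", "lightblue", "maroon3"),
      stringsAsFactors = FALSE
    )

    # Land cover values of Region 2 from the example data
    cover_r2 <- factor(c("Lake", "Populated", "Populated", "Ocean", "Populated"),
                       levels = color_table$Land_cover)

    # Levels that remain once the unused ones are dropped
    present <- levels(droplevels(cover_r2))

    # Positional assignment: the first colours in the table go to the remaining levels
    assigned <- setNames(color_table$Color[seq_along(present)], present)
    print(assigned)
    ```

    Under this assumption Ocean ends up yellow, Lake darkgreen and Populated blue4, rather than the colours listed for them in color_table.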
    

    There are two ways to get the correct colours:

    Solution 1. drop = FALSE (legend shows all categories)

    Adding drop = FALSE inside scale_fill_manual is by far the simplest fix. You then get the correct colours, and the legend shows all possible categories, not only those that appear in the plot:

    for (region in 1:2){
      gg <- ggplot(subset(df, Region %in% region),
                   aes(x = Area_no, y = Square_km, fill = Land_cover)) +
        geom_col() + 
        scale_fill_manual(values = color_table$Color, drop = FALSE) 
      ggsave(paste0("Areas_region_", region, ".png"), plot = gg, width = 5, height = 3)
      }
    

    Solution 2. Pick colors for each plot (legend shows only the categories shown in plot)

    If for some reason you don't want the legend to show all possible categories (for instance if there is a huge number of them), you need to pick the correct colors for each plot:

    for (region in 1:2){
      df_plot <- subset(df, Region %in% region)
      # Integer codes of the factor levels that occur in this subset, in level order
      actual_cover <- sort(unique(as.integer(df_plot$Land_cover)))
      gg <- ggplot(df_plot, aes(x = Area_no, y = Square_km, fill = Land_cover)) +
        geom_col() + 
        scale_fill_manual(values = color_table$Color[actual_cover])
      ggsave(paste0("Areas_region_", region, "ver3.png"), plot = gg, width = 5, height = 3)
      }
    

    which results in the following plot (for Region 2):

    What we actually do here is build a vector actual_cover holding the positions (1 to 5) of the factor levels, and therefore of the colours, that are actually used in the current plot. As a result, the legend contains only the categories present in the plot, while the colours are still correct.
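
    A possible alternative that is not part of the answer above: scale_fill_manual() also accepts a named vector for values, in which case colours are matched to categories by name rather than by position. The sketch below assumes that behaviour and rebuilds Region 2 of the example data. The names cover_colors and used are illustrative:

    ```r
    library(ggplot2)

    color_table <- data.frame(
      Land_cover = c("Agriculture", "Forest", "Ocean", "Lake", "Populated"),
      Color = c("yellow", "darkgreen", "blue4", "lightblue", "maroon3"),
      stringsAsFactors = FALSE
    )

    # Region 2 of the example data
    df_plot <- data.frame(
      Area_no = 1:5,
      Land_cover = c("Lake", "Populated", "Populated", "Ocean", "Populated"),
      Square_km = c(5, 30, 20, 40, 10),
      stringsAsFactors = FALSE
    )
    df_plot$Land_cover <- factor(df_plot$Land_cover, levels = color_table$Land_cover)

    # Named vector: names are the categories, values are the colours
    cover_colors <- setNames(color_table$Color, color_table$Land_cover)

    gg <- ggplot(df_plot, aes(x = Area_no, y = Square_km, fill = Land_cover)) +
      geom_col() +
      scale_fill_manual(values = cover_colors)

    # Fill colours actually used by the bar layer, for inspection
    used <- unique(ggplot_build(gg)$data[[1]]$fill)
    ```

    With this approach there is no need to compute which levels are present in each subset, so it may be worth trying before Solution 2 when the set of categories changes from plot to plot.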
