I'm trying to create several graphs using ggplot. The graphs are a series of bar graphs that together describe a line as well EXAMPLE (BTW, yes I realize the color palett
The answer of Jay B. Martin doesn't fully answer the question. So although this question is quite old, here is a solution for future reference. We make some data for a reproducible example:
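library(tibble)   # for tibble()
library(ggplot2)  # for ggplot(), geom_col(), scale_fill_manual(), ggsave()
color_table <- tibble(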
Land_cover = c("Agriculture", "Forest", "Ocean", "Lake", "Populated"),
Color = c("yellow", "darkgreen", "blue4", "lightblue", "maroon3")
)
df <- data.frame(
Region = c(rep(1,5), rep(2,5)),
Area_no = c(1,2,3,4,5,1,2,3,4,5),
Land_cover = c("Agriculture", "Forest", "Agriculture", "Agriculture", "Lake",
"Lake", "Populated", "Populated", "Ocean", "Populated"),
Square_km = c(10,15,7,12,3, 5,30,20,40,10)
)
So, we want to use df to make a graph for each Region, where Land_cover is represented by the correct color given by color_table. First, we must make sure that the Land_cover variable in the data set df is a factor whose levels are in the same order as the colors we want to assign to each type of land cover. We do that by using the order from color_table:
df$Land_cover <- factor(df$Land_cover, levels = color_table$Land_cover)
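As a quick check of what this step changes, the following base R sketch compares the default level order with the order taken from color_table. The data are rebuilt so the snippet runs on its own, and the names land_cover, default_levels, ordered_levels and level_to_color are introduced here for illustration only.

```r
# Same lookup table and land cover values as above, rebuilt in base R.
color_table <- data.frame(
  Land_cover = c("Agriculture", "Forest", "Ocean", "Lake", "Populated"),
  Color = c("yellow", "darkgreen", "blue4", "lightblue", "maroon3"),
  stringsAsFactors = FALSE
)
land_cover <- c("Agriculture", "Forest", "Agriculture", "Agriculture", "Lake",
                "Lake", "Populated", "Populated", "Ocean", "Populated")

# Without explicit levels, factor() sorts the levels alphabetically.
default_levels <- levels(factor(land_cover))

# With explicit levels, the order follows color_table instead.
ordered_levels <- levels(factor(land_cover, levels = color_table$Land_cover))

# Level i now lines up with color_table$Color[i].
level_to_color <- data.frame(Level = ordered_levels, Color = color_table$Color)
print(level_to_color)
```

Here the alphabetical default would place Lake before Ocean, so without the explicit levels those two colours would be swapped.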
Now, by far the simplest way to plot using the correct colors is, as Jay B. Martin suggests in the comments, to use facet_grid() or facet_wrap():
ggplot(df, aes(x = Area_no, y = Square_km, fill = Land_cover)) +
geom_col() +
scale_fill_manual(values = color_table$Color) +
facet_grid(.~Region)
But what if you want to make a separate plot for each Region? For instance, you want to save each plot as a separate file.
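If we simply write a small loop that selects a subset of the data and reuses the code above (minus facet_grid), we clearly get the wrong colours, because unused factor levels are dropped by default and the unnamed colour values are then assigned by position to the levels that remain (shown here for Region 2):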
for (region in 1:2) {
  gg <- ggplot(subset(df, Region %in% region),
               aes(x = Area_no, y = Square_km, fill = Land_cover)) +
    geom_col() +
    scale_fill_manual(values = color_table$Color)
  # Pass the plot explicitly instead of relying on last_plot()
  ggsave(paste0("Areas_region_", region, ".png"), plot = gg, width = 5, height = 3)
}
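One way to see where the wrong colours come from is to mimic the matching in base R. This is only a sketch: it does not call ggplot2, it merely imitates how unnamed values line up with the levels that remain once unused ones are dropped. The names region2, present, assigned and intended are introduced here for illustration.

```r
# Rebuild the example data in base R so the snippet runs on its own.
color_table <- data.frame(
  Land_cover = c("Agriculture", "Forest", "Ocean", "Lake", "Populated"),
  Color = c("yellow", "darkgreen", "blue4", "lightblue", "maroon3"),
  stringsAsFactors = FALSE
)
df <- data.frame(
  Region = c(rep(1, 5), rep(2, 5)),
  Land_cover = c("Agriculture", "Forest", "Agriculture", "Agriculture", "Lake",
                 "Lake", "Populated", "Populated", "Ocean", "Populated"),
  stringsAsFactors = FALSE
)
df$Land_cover <- factor(df$Land_cover, levels = color_table$Land_cover)

region2 <- subset(df, Region == 2)

# Levels that actually occur in Region 2, in factor-level order.
present <- levels(droplevels(region2$Land_cover))

# Positional matching: the first colours go to whatever levels remain ...
assigned <- color_table$Color[seq_along(present)]
# ... whereas the intended colours come from the lookup table.
intended <- color_table$Color[match(present, color_table$Land_cover)]

print(data.frame(Land_cover = present, assigned = assigned, intended = intended))
```

For Region 2 only Ocean, Lake and Populated remain, so they receive the first three colours of the table rather than their own.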
There are two ways to get the correct colours:
Adding drop = FALSE inside scale_fill_manual is by far the simplest. You will then get the correct colours, and the legend will show all possible categories, not only those that are in the plot:
for (region in 1:2) {
  gg <- ggplot(subset(df, Region %in% region),
               aes(x = Area_no, y = Square_km, fill = Land_cover)) +
    geom_col() +
    # drop = FALSE keeps unused levels, so colours stay aligned with the levels
    scale_fill_manual(values = color_table$Color, drop = FALSE)
  ggsave(paste0("Areas_region_", region, ".png"), plot = gg, width = 5, height = 3)
}
If for some reason you don't want the legend to show all possible categories (for instance if there is a huge number of them), you need to pick the correct colors for each plot:
library(magrittr)
for (region in 1:2) {
  df_plot <- subset(df, Region %in% region)
  # Integer codes of the factor levels that occur in this subset, in increasing order
  actual_cover <- df_plot$Land_cover %>% as.numeric() %>% table() %>% names() %>% as.numeric()
  gg <- ggplot(df_plot, aes(x = Area_no, y = Square_km, fill = Land_cover)) +
    geom_col() +
    # Keep only the colours that belong to the levels present in this plot
    scale_fill_manual(values = color_table$Color[actual_cover])
  ggsave(paste0("Areas_region_", region, "ver3.png"), plot = gg, width = 5, height = 3)
}
which results in the following plot (for Region 2):
What we actually do here is build a vector actual_cover holding the indices (1 to 5) of the colours that are actually used in the current plot. As a result, the legend contains only the categories present in the plot, while the colours are still correct.
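A possible alternative, not part of the approach above, is to give scale_fill_manual a named vector, since ggplot2 matches named values by name rather than by position. The sketch below assumes the same example data. The name named_colors and the "_named" file suffix are made up for this illustration.

```r
library(ggplot2)

# Rebuild the example data so the snippet runs on its own.
color_table <- data.frame(
  Land_cover = c("Agriculture", "Forest", "Ocean", "Lake", "Populated"),
  Color = c("yellow", "darkgreen", "blue4", "lightblue", "maroon3"),
  stringsAsFactors = FALSE
)
df <- data.frame(
  Region = c(rep(1, 5), rep(2, 5)),
  Area_no = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
  Land_cover = c("Agriculture", "Forest", "Agriculture", "Agriculture", "Lake",
                 "Lake", "Populated", "Populated", "Ocean", "Populated"),
  Square_km = c(10, 15, 7, 12, 3, 5, 30, 20, 40, 10)
)
# The factor order still controls the order of the legend entries.
df$Land_cover <- factor(df$Land_cover, levels = color_table$Land_cover)

# Named vector: each colour carries the name of its land cover type.
named_colors <- setNames(color_table$Color, color_table$Land_cover)

for (region in 1:2) {
  df_plot <- subset(df, Region == region)
  gg <- ggplot(df_plot, aes(x = Area_no, y = Square_km, fill = Land_cover)) +
    geom_col() +
    # Values are looked up by name, so dropped levels cannot shift the colours
    scale_fill_manual(values = named_colors)
  # Hypothetical file name, chosen for this sketch only
  ggsave(paste0("Areas_region_", region, "_named.png"), plot = gg, width = 5, height = 3)
}
```

With this variant the legend should again list only the categories present in each plot, without computing an index vector per region.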