mapping (ordered) factors to colors in ggplot

Submitted by 有些话、适合烂在心里 on 2021-02-20 04:05:55

Question


Consider this example:

library(tidyverse)

# tibble() replaces the deprecated data_frame()
tibble(mylabel = c('month 18',
                   'month 19',
                   'month 20',
                   'month 21',
                   'month 22'),
       value = c(5, 10, -2, 2, 0),
       time = c(1, 2, 3, 4, 5)) %>%
  ggplot(aes(x = time, y = value, color = mylabel)) +
  geom_point(size = 7)

Here you can see that the variable mylabel has a natural ordering: month 18 comes before month 19 etc.

However, this natural ordering is not preserved by the colors chosen by ggplot. In my real dataset, I have about 50 different months and I would like to use a color scale that makes this increase more intuitive (say from cold to hot).

How can I do that? Thanks!


Answer 1:


You can use the viridis color scale, or another sequential scale whose colors convey order.

Several similar palettes are included (option = "A" through "D"). Reverse the direction with direction = -1.

I've added a step to get the ordering right in case the months are listed out of order. It works, but I'm sure there's a simpler way. Pull the month number out of the label (it has to be converted from character to numeric), then convert it to a factor, which puts the levels in numeric order.

library(tidyverse)

# tibble() replaces the deprecated data_frame()
tibble(mylabel = paste("month", 1:10),
       value = rnorm(length(mylabel)),
       time = seq_along(mylabel)) %>%
  # extract the number from the label; factor() sorts numeric values numerically
  mutate(month_number = factor(as.numeric(gsub("month ([0-9]+)", "\\1", mylabel)))) %>%
  ggplot(aes(x = time, y = value, color = month_number)) +
  geom_point(size = 7) +
  scale_color_viridis_d(option = "B", direction = -1)

Created on 2018-11-30 by the reprex package (v0.2.1)
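
As one possible simplification of the extraction step above, readr::parse_number() can pull the number out of each label without a regular expression. This is only a sketch: the shuffled labels and the seed are made up for illustration, and it assumes every label contains exactly one number.

```r
library(dplyr)
library(ggplot2)

set.seed(1)
# Hypothetical data with the months deliberately out of order
df <- tibble(mylabel = paste("month", c(10, 2, 1, 9, 11)),
             value = rnorm(5),
             time = 1:5) %>%
  # parse_number() drops the non-numeric text around the first number
  mutate(month_number = factor(readr::parse_number(mylabel)))

levels(df$month_number)
#> [1] "1"  "2"  "9"  "10" "11"

p <- ggplot(df, aes(x = time, y = value, color = month_number)) +
  geom_point(size = 7) +
  scale_color_viridis_d(option = "B", direction = -1)
```

Because factor() sorts numeric input numerically, the levels come out in month order regardless of the row order.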




Answer 2:


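The as_factor function in forcats orders levels as they first occur in the data, rather than alphabetically, which would put all labels that start with "1" first, then all that start with "2", and so on. This dodges the problem that appears once you have months 1 through 12, where "month 10" would otherwise sort ahead of "month 2".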
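
The difference between the two orderings can be seen in a small comparison. This is a sketch with made-up labels; note that as_factor() only gives the right order when the labels first appear in the data in the intended order.

```r
labels <- paste("month", 1:12)

# Base factor() sorts character levels alphabetically
head(levels(factor(labels)), 4)
#> [1] "month 1"  "month 10" "month 11" "month 12"

# forcats::as_factor() keeps levels in order of first appearance
head(levels(forcats::as_factor(labels)), 4)
#> [1] "month 1" "month 2" "month 3" "month 4"
```

If the rows are not already sorted by month, sort them first or derive the level order from the month number instead.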

I made up different data just to get the full set of month labels.

library(dplyr)
library(ggplot2)

set.seed(1234)
# tibble() replaces the deprecated data_frame()
df <- tibble(mylabel = paste("month", 1:12),
             value = rnorm(12),
             time = 1:12)

df_fact <- df %>%
  mutate(mylabel = forcats::as_factor(mylabel))

levels(df_fact$mylabel)
#>  [1] "month 1"  "month 2"  "month 3"  "month 4"  "month 5"  "month 6" 
#>  [7] "month 7"  "month 8"  "month 9"  "month 10" "month 11" "month 12"

ggplot(df_fact, aes(x = time, y = value, color = mylabel)) +
  geom_point(size = 7)

You can then swap in a color scale that better suits sequential data. I often use the ColorBrewer ones, but also like some of the rcartocolor scales. In this case, 12 levels exceeds the number of colors available in a lot of sequential palettes, although the viridis scales that ship with ggplot2 (e.g. scale_color_viridis_d) will interpolate to fit this many levels.
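
As a rough illustration of the point about palette size, a 9-color ColorBrewer palette can be stretched to 12 levels by interpolation. The choice of "YlOrRd" and the use of colorRampPalette() are assumptions for this sketch, not something the answer prescribes, and the data is rebuilt here so the block runs on its own.

```r
library(ggplot2)

set.seed(1234)
month_labels <- paste("month", 1:12)
df_fact <- data.frame(
  mylabel = factor(month_labels, levels = month_labels),  # levels in month order
  value = rnorm(12),
  time = 1:12
)

# "YlOrRd" has at most 9 colors, so interpolate them up to 12
pal <- colorRampPalette(RColorBrewer::brewer.pal(9, "YlOrRd"))(12)

p <- ggplot(df_fact, aes(x = time, y = value, color = mylabel)) +
  geom_point(size = 7) +
  scale_color_manual(values = pal)  # unnamed values are matched to levels in order
```

Replacing the last line with scale_color_viridis_d() gives the built-in alternative mentioned above, with no manual interpolation needed.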

Created on 2018-11-30 by the reprex package (v0.2.1)



Source: https://stackoverflow.com/questions/53560099/mapping-ordered-factors-to-colors-in-ggplot
