Question
I'm creating a word cloud in which the size of each word is based on its frequency, but I want the colour of the words to be mapped to a third variable (stress, a continuous numeric variable giving the amount of stress associated with each word).
I tried the following, which gave me only two different colours (yellow and purple), whereas I want something smoother: a colour range like a palette that goes from green to red, for example.
library(dplyr)      # as_tibble(); replaces the deprecated tbl_df()
library(wordcloud)

df <- data.frame(word = c("calling", "meeting", "conference", "contract", "negotiation", "email"),
                 n = c(20, 12, 4, 8, 10, 43),
                 stress = c(23, 30, 15, 40, 35, 15))
df <- as_tibble(df)
wordcloud(words = df$word, freq = df$n, col = df$stress)
Does anyone know how to handle this continuous metadata and get a smoothly changing colour for the words as stress goes up? Thanks!
Answer 1:
Here is a potential solution using the wordcloud2 package. Since I do not know your real data, I created sample data to demonstrate a prototype.
If you have many words, I am not sure that coloring them by a continuous variable (stress) is a good idea. One thing you could do is create a new grouping variable with cut(), which reduces the number of colors used in the graphic. Here, I created a new column called color with five colors from the viridis package.
When you use wordcloud2(), you only need to supply two things: the data and the colors. Font size reflects word frequency automatically, because wordcloud2() treats the first column as the words and the second as their frequencies.
mydf <- data.frame(word = c("calling", "meeting", "conference", "contract", "negotiation",
                            "email", "friends", "chat", "text", "deal",
                            "business", "promotion", "discount", "users", "family"),
                   n = c(20, 12, 4, 8, 10, 43, 33, 5, 47, 28, 12, 9, 50, 31, 22),
                   stress = c(23, 30, 15, 40, 35, 15, 30, 18, 10, 5, 29, 38, 45, 8, 3))
          word  n stress
1      calling 20     23
2      meeting 12     30
3   conference  4     15
4     contract  8     40
5  negotiation 10     35
6        email 43     15
7      friends 33     30
8         chat  5     18
9         text 47     10
10        deal 28      5
11    business 12     29
12   promotion  9     38
13    discount 50     45
14       users 31      8
15      family 22      3
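library(dplyr)
library(wordcloud2)
library(viridis)   # origin of the hex codes below, which are hard-coded here

# Bin stress into five intervals and label each bin with a viridis hex code.
# cut() returns a factor, so convert it to character before passing it as colors.
temp <- mutate(mydf,
               color = as.character(cut(stress,
                                        breaks = c(0, 10, 20, 30, 40, Inf),
                                        labels = c("#FDE725FF", "#73D055FF", "#1F968BFF",
                                                   "#2D708EFF", "#481567FF"),
                                        include.lowest = TRUE)))
wordcloud2(data = temp, color = temp$color)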
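To see how cut() assigns the sample stress values to bins, a quick check in base R may help. This is only an illustrative sketch that reuses the breaks from the answer; the variable names stress and bins are introduced here for the example.

```r
# Sample stress values from the answer's data frame
stress <- c(23, 30, 15, 40, 35, 15, 30, 18, 10, 5, 29, 38, 45, 8, 3)

# Same breaks as above, but without custom labels so the intervals are visible.
# Intervals are right-closed by default, so a value of 30 lands in (20,30].
bins <- cut(stress, breaks = c(0, 10, 20, 30, 40, Inf), include.lowest = TRUE)

# Number of words that fall into each bin
print(table(bins))
```

Each bin then receives one color, so the number of distinct colors in the cloud equals the number of non-empty bins.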
Answer 2:
Alternatively, here is a more automatic approach that avoids specifying exact threshold values and colors:
library(RColorBrewer)
library(wordcloud2)

mydf <- data.frame(word = c("calling", "meeting", "conference", "contract", "negotiation",
                            "email", "friends", "chat", "text", "deal",
                            "business", "promotion", "discount", "users", "family"),
                   n = c(20, 12, 4, 8, 10, 43, 33, 5, 47, 28, 12, 9, 50, 31, 22),
                   stress = c(23, 30, 15, 40, 35, 15, 30, 18, 10, 5, 29, 38, 45, 8, 3))

# One color per distinct stress value
color_range_number <- length(unique(mydf$stress))
# Interpolate between shades 3 to 7 of "Blues", then look up each word's color
# by the rank of its stress value (factor levels sort in ascending order)
color <- colorRampPalette(brewer.pal(9, "Blues")[3:7])(color_range_number)[factor(mydf$stress)]
wordcloud2(mydf, color = color)
With this, word size is determined by n and the shade of color by stress.
The [3:7] subset adjusts the range of the color scale: in the 9-color palette, 1 is the lightest shade and 9 is the darkest.
You can browse the other color palette options with:
display.brewer.all()
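The question asked for a gradient such as green to red. Here is a minimal sketch of one way to get that, assuming base R's colorRampPalette() is acceptable. The helper stress_to_colour and its arguments are hypothetical names made up for this example; they are not part of wordcloud2 or of either answer. Unlike the factor-based lookup, it spaces colors by the stress value itself rather than by its rank.

```r
# Hypothetical helper: map a numeric vector onto a colour gradient by value.
stress_to_colour <- function(x, low = "green", high = "red", n = 100) {
  palette <- colorRampPalette(c(low, high))(n)   # n hex colours from low to high
  rng <- range(x, na.rm = TRUE)
  # If every value is identical there is nothing to scale; use the first colour
  if (rng[1] == rng[2]) return(rep(palette[1], length(x)))
  # Rescale x to [0, 1], then convert to an index between 1 and n
  idx <- 1 + floor((x - rng[1]) / (rng[2] - rng[1]) * (n - 1))
  palette[idx]
}

# Stress values from the question's data
stress <- c(23, 30, 15, 40, 35, 15)
cols <- stress_to_colour(stress)
print(cols)

# Possible usage with the answers' data frame (requires wordcloud2):
# wordcloud2::wordcloud2(mydf, color = stress_to_colour(mydf$stress))
```

The lowest stress value maps to the low colour and the highest to the high colour, with everything else interpolated in between; raising n gives finer steps.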
Source: https://stackoverflow.com/questions/43894416/wordcloud-showing-colour-based-on-continous-metadata-in-r