Question
As someone new to R, I am trying to produce a word cloud that shows two variables: frequency and rating. Using a generic table, I want to display the hypothetical number of colleges by state (font size, from large to small, follows the count) and the hypothetical average college rating as color:
- 1 = green (good),
- 3 = yellow (average),
- 5 = red (bad)
I am able to create a cloud in which font size reflects the number of colleges, but I cannot tie the color to the rating in the third column. Here is my generic table:
State Colleges Rating
Alabama 220 1
Alaska 100 3
Arizona 50 5
Arkansas 275 1
California 155 3
Colorado 68 5
Connecticut 235 1
Delaware 189 3
Florida 32 5
Georgia 219 1
Hawaii 117 3
Idaho 63 5
Illinois 264 1
Indiana 167 3
Iowa 76 5
Kansas 287 1
Kentucky 178 3
Louisiana 67 5
Maine 246 1
Maryland 169 3
Massachusetts 46 5
Michigan 225 1
Minnesota 132 3
Mississippi 23 5
Missouri 219 1
Montana 194 3
Nebraska 97 5
Below is my very simple script:
library(wordcloud)
library(RColorBrewer)  # package names are case-sensitive
data <- read.csv("wordcloud.csv", header = TRUE)
pal <- brewer.pal(9, "RdYlGn")
wordcloud(data$State, data$Colleges, scale = c(4,1), colors = pal, rot.per=.5)
The above script allows for text size to reflect number of colleges, but I am not able to link the color ramp of 1 = green (good) to 3 = yellow (average) to 5 = red (bad). Any suggestions are greatly appreciated.
Answer 1:
There's also the possibility to plot a comparison cloud in such cases.
For this, we first convert the data from long to wide format:
library(reshape2)
df1 <- dcast(df1,State + Colleges ~ Rating, value.var = "Colleges")
Then we perform a few standard operations to prepare a suitable matrix:
rownames(df1) <- df1[,1] # use the state names as row names
df1 <- df1[,-c(1,2)] # remove the "State" and "Colleges" columns
df1[is.na(df1)] <- 0 #set NA values to zero
df1 <- as.matrix(df1) #convert into matrix
colnames(df1) <- c("good", "average", "bad")
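To make the target shape concrete, here is a minimal base-R sketch of the same kind of matrix, built without reshape2. It assumes the ratings take only the values 1, 3 and 5, and it uses a hypothetical three-row data frame `df` (not the full table) so the result is easy to check by eye:

```r
# Hypothetical three-row stand-in for the full table
df <- data.frame(
  State    = c("Alabama", "Alaska", "Arizona"),
  Colleges = c(220, 100, 50),
  Rating   = c(1, 3, 5)
)

# One row per state, one column per rating group, filled with zeros
m <- matrix(0, nrow = nrow(df), ncol = 3,
            dimnames = list(as.character(df$State),
                            c("good", "average", "bad")))

# Put each state's college count in the column that matches its rating
col_idx <- match(df$Rating, c(1, 3, 5))
m[cbind(seq_len(nrow(df)), col_idx)] <- df$Colleges

print(m)
```

Each row has a single non-zero cell, which is what lets the comparison cloud place every state in exactly one group.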
Finally, we can plot the comparison cloud and assign colors to the groups as we wish:
library(wordcloud)
comparison.cloud(df1,max.words=Inf,random.order=FALSE, scale = c(4,.5),
title.size = 1, colors=c("green","orange","red"))
Data (define df1 before running the code above):
df1 <- structure(list(State = structure(1:27, .Label = c("Alabama",
"Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut",
"Delaware", "Florida", "Georgia", "Hawaii", "Idaho", "Illinois",
"Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine",
"Maryland", "Massachusetts", "Michigan", "Minnesota", "Mississippi",
"Missouri", "Montana", "Nebraska"), class = "factor"), Colleges = c(220L,
100L, 50L, 275L, 155L, 68L, 235L, 189L, 32L, 219L, 117L, 63L,
264L, 167L, 76L, 287L, 178L, 67L, 246L, 169L, 46L, 225L, 132L,
23L, 219L, 194L, 97L), Rating = c(1L, 3L, 5L, 1L, 3L, 5L, 1L,
3L, 5L, 1L, 3L, 5L, 1L, 3L, 5L, 1L, 3L, 5L, 1L, 3L, 5L, 1L, 3L,
5L, 1L, 3L, 5L)), .Names = c("State", "Colleges", "Rating"),
class = "data.frame", row.names = c(NA, -27L))
Answer 2:
You can assign the colours manually, one per word, and set ordered.colors = TRUE so that each colour is matched to the word in the same position:
wordcloud(data$State, data$Colleges,
scale = c(4,1),
colors = rep(c("green", "yellow", "red"), 9),
rot.per=.5,
ordered.colors=TRUE)
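The rep() call works here because the ratings in the sample table cycle 1, 3, 5 in row order. If the rows were in a different order, a lookup keyed on the rating would be safer. A minimal sketch, assuming a Rating column as in the question; the four-row `data` frame and the names `rating_cols` and `word_cols` are hypothetical, for illustration only:

```r
# Hypothetical stand-in for the first rows of wordcloud.csv
data <- data.frame(
  State    = c("Alabama", "Alaska", "Arizona", "Arkansas"),
  Colleges = c(220, 100, 50, 275),
  Rating   = c(1, 3, 5, 1)
)

# Named lookup from rating value to colour
rating_cols <- c("1" = "green", "3" = "yellow", "5" = "red")

# One colour per row, taken from that row's rating
word_cols <- unname(rating_cols[as.character(data$Rating)])

# Draw the cloud only if the wordcloud package is installed
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(data$State, data$Colleges,
                       scale = c(4, 1),
                       colors = word_cols,
                       rot.per = 0.5,
                       ordered.colors = TRUE)
}
```

With ordered.colors = TRUE the colour vector must have the same length as the word vector, which the lookup guarantees regardless of row order.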
Source: https://stackoverflow.com/questions/36048999/word-cloud-in-r-with-two-separate-values