Creating Stacked Bar Chart With one Variable for each Bar, using melt, and ggplot

Submitted by 大憨熊 on 2019-12-13 08:57:30

Question


This question raises different points from the one I posted yesterday and comes with a better description, so I hope for your understanding. I have the following data:

Data <- data.frame(LMX = c(1.92, 2.33, 3.52, 5.34, 6.07, 4.23, 3.45, 5.64), Thriving = c(4.33, 6.54, 6.13, 4.85, 4.26, 6.32, 5.63, 4.55), Wellbeing = c(1.92, 2.33, 3.52, 2.34, 4.07, 3.23, 3.45, 4.64))
rownames(Data) <- 1:8

My aim is to generate a horizontal (flipped) bar chart with one bar per variable, where every bar sums to 100% and is divided according to the values: yellow for values from 0 to 1.99, orange for values from 2 to 3.99, red for values from 4 to 5.99, and green for values from 6 to 7. More precisely, I am looking for something like this:

I tried the following code:

library(reshape2)  # melt()
library(ggplot2)
library(scales)    # percent_format()

Data_A <- melt(cbind(Data, ind = rownames(Data)), id.vars = c('ind'))

ggplot(Data_A, aes(x = variable, y = value, fill = factor(value))) + 
  geom_bar(position = "fill", stat = "identity") + 
  scale_y_continuous(labels = percent_format()) + 
  coord_flip()

Unfortunately, I have no idea how to group the values into the categories I mentioned above. What is more, with this code the values are not even arranged in the right order, from low to high.

Could you please give me some recommendations on how to get a chart like the one described above?

There is one further problem: each of the 8 individuals belongs to one of two groups, and I would like to compare the values between those two groups. However, adding this variable to my code would just melt it together with the other variables, so I don't see a way to account for the groups, for instance by passing the group identifier to facet_grid(). Do you have a suggestion here as well? Should I maybe use an entirely different approach or code?


Answer 1:


Is this what you're looking for regarding the first part? (I advise changing the colors to prevent epileptic seizures.)

library(dplyr)
library(tidyr)
library(ggplot2)

Data %>%
  # right = FALSE gives left-closed bins: [0,2), [2,4), [4,6), [6,7)
  mutate_all(cut, breaks = c(0, 2, 4, 6, 7), right = FALSE) %>% 
  gather(key = "variable", value = "value") %>% 
  ggplot(aes(x = variable, fill = value)) + 
  geom_bar(position = position_fill(reverse = TRUE)) +
  coord_flip() +
  scale_fill_manual(values = c("yellow", "orange", "red", "green"))

For the second part, a reproducible example would be useful but you can probably add a "group" variable (between gather and ggplot) and use facet_grid or facet_wrap.

--- Edited below after information about groups ---

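The subset DataG[Data_IlA$G1_ID == 2] is missing the comma that separates rows from columns, so R treats the logical vector as a column selector rather than a row filter. In addition, the names used there (Data_IlA, G1_ID) do not match the ones in DataG, whose group column is Group_ID, so DataG_1 cannot be created.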
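To illustrate the indexing point, here is a minimal base R sketch. It assumes the DataG definition posted in the question and that the intended filter is on the Group_ID column; the names group2 and no_comma are made up for this example.

```r
# DataG as posted in the question (Group_ID is the group column).
DataG <- data.frame(
  LMX       = c(1.92, 2.33, 3.52, 5.34, 6.07, 4.23, 3.45, 5.64),
  Thriving  = c(4.33, 6.54, 6.13, 4.85, 4.26, 6.32, 5.63, 4.55),
  Wellbeing = c(1.92, 2.33, 3.52, 2.34, 4.07, 3.23, 3.45, 4.64),
  Group_ID  = c(1, 2, 1, 2, 2, 2, 1, 1)
)

# Row filter: the condition goes BEFORE the comma, the columns after it.
group2 <- DataG[DataG$Group_ID == 2, c("LMX", "Thriving", "Wellbeing")]
nrow(group2)  # 4 respondents belong to group 2

# Without the comma, the 8-element logical vector is used to pick columns
# of a 4-column data frame, which fails. tryCatch() captures the message
# instead of stopping the script.
no_comma <- tryCatch(
  DataG[DataG$Group_ID == 2],
  error = function(e) conditionMessage(e)
)
no_comma
```

The filtered data frame can then be piped into the same mutate/gather/ggplot chain, although faceting on Group_ID avoids having to build one subset per group.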

Does one of the suggestions below make the figure you want?

DataG %>%
  gather(key = "variable", value = "value", -Group_ID) %>%
  mutate(value = cut(value, c(0, 1.99, 3.99, 5.99, 7))) %>%
  ggplot(aes(x = variable, fill = value)) +
  geom_bar(position = position_fill(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  scale_fill_manual(values=c("#19557E","#6E3B60", "#EA916A", "#EFC76C")) +
  theme(panel.background = element_blank()) +
  xlab("") + ylab("") +
  facet_grid(Group_ID ~ .)

DataG %>%
  gather(key = "variable", value = "value", -Group_ID) %>%
  mutate(value = cut(value, c(0, 1.99, 3.99, 5.99, 7))) %>%
  ggplot(aes(x = Group_ID, fill = value)) +
  geom_bar(position = position_fill(reverse = TRUE)) +
  scale_x_discrete(limits = c("Group 1","Group 2")) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  scale_fill_manual(values=c("#19557E","#6E3B60", "#EA916A", "#EFC76C")) +
  theme(panel.background = element_blank()) +
  xlab("") + ylab("") +
  facet_grid(variable ~ .)

--- Edited below after comment on groups ---

If you need to change categories for any variable, the easiest way may be to do so before calling ggplot:

DataG %>%
  mutate(Group_ID = case_when(
    Group_ID == 1 ~ "1st group's name",
    Group_ID == 2 ~ "2nd group's name"
  )) %>% 
  gather(key = "variable", value = "value", -Group_ID) %>%
  mutate(value = cut(value, c(0, 1.99, 3.99, 5.99, 7))) %>%
  ggplot(aes(x = variable, fill = value)) +
  geom_bar(position = position_fill(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  scale_fill_manual(values=c("#19557E","#6E3B60", "#EA916A", "#EFC76C")) +
  theme(panel.background = element_blank()) +
  xlab("") + ylab("") +
  facet_grid(Group_ID ~ .)



Answer 2:


You are OK up to the melt. Does this do what you are after?

ggplot(Data_A, aes(x = variable, y = value, fill = cut(value,breaks = c(0,2,4,6,7)))) + 
  geom_bar(position = "fill", stat = "identity") + 
  scale_y_continuous(labels = percent_format())  +
  scale_fill_manual(name="answer",values=c("yellow","orange","red","green")) +
  coord_flip()
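One thing worth checking with this approach: with stat = "identity" and position = "fill", each segment's length is proportional to the sum of the values in that bin, not to the number of respondents in it. The base R sketch below compares the two shares for the LMX column from the question; the variable names are made up for this example.

```r
# LMX values from the question.
lmx  <- c(1.92, 2.33, 3.52, 5.34, 6.07, 4.23, 3.45, 5.64)
bins <- cut(lmx, breaks = c(0, 2, 4, 6, 7))

# Share of respondents per bin (what counting rows gives you):
# 1/8, 3/8, 3/8, 1/8 for this column.
count_share <- prop.table(table(bins))

# Share of the summed values per bin (what stacking y = value and
# normalising with position = "fill" gives you).
value_share <- tapply(lmx, bins, sum) / sum(lmx)

round(rbind(count_share, value_share), 3)
```

If the bars are meant to show the percentage of respondents in each category, counting rows (geom_bar() without y and without stat = "identity") is the closer match.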




Answer 3:


To group numeric values into fill categories, use the cut() function. It bins the numbers into intervals defined by the breaks you supply (the breaks may range anywhere from -Inf to +Inf). These bins can then be colored individually using scale_fill_manual().
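As a quick check of which bin a boundary value falls into, here is a small base R sketch. The vector x is made-up illustration data, not taken from the question.

```r
# Made-up values, including the boundaries 2 and 4.
x      <- c(1.92, 2, 3.99, 4, 5.64, 6.07)
breaks <- c(0, 2, 4, 6, 7)

# Default (right = TRUE): intervals are open on the left, closed on the right.
right_closed <- cut(x, breaks)

# right = FALSE: intervals are closed on the left, open on the right.
left_closed <- cut(x, breaks, right = FALSE)

data.frame(x, right_closed, left_closed)
```

With the default, exactly 2 lands in (0,2]; with right = FALSE it lands in [2,4), which is closer to the "2 to 3.99" category described in the question. Note that with right = FALSE a value of exactly 7 would become NA unless include.lowest = TRUE is also set.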

Use this code:

ggplot(Data_A, aes(x = variable, y = value)) +
  scale_y_continuous(labels = percent_format())+coord_flip()+ 
  geom_bar(position = "fill", stat = "identity",aes(fill=cut(value,c(0,2,4,6,7))))+
  scale_fill_manual(values=c("#F8F668","#F8BA5B","#F66053","#82F653"))+
  labs(fill="")+theme(panel.background = element_blank())

The output of this plot is provided below:

Hope this helps!!




Answer 4:


Thanks to the very helpful answers, I was able to put together the following code to answer the first question I originally asked:

DataG <- data.frame(LMX = c(1.92, 2.33, 3.52, 5.34, 6.07, 4.23, 3.45, 5.64), Thriving = c(4.33, 6.54, 6.13, 4.85, 4.26, 6.32, 5.63, 4.55), Wellbeing = c(1.92, 2.33, 3.52, 2.34, 4.07, 3.23, 3.45, 4.64) , Group_ID = c(1, 2, 1, 2, 2, 2, 1, 1))
rownames(DataG) <- 1:8


DataG[Data_IlA$G1_ID == 2] %>%
  select("Leader-Member-Exchange" = LMX, "Thriving" = Thriving, "Wellbeing" = Wellbeing) %>% 
  na.omit -> DataG_1

DataG_1 %>%
  mutate_all(cut, c(0, 1.99, 3.99, 5.99, 7) ) %>%
  gather(key = "variable", value = "value") %>%
  ggplot(aes(x = variable, fill = value)) +
  geom_bar(position = position_fill(reverse = TRUE)) +
  scale_y_continuous(labels = percent_format()) +
  coord_flip() +
  scale_fill_manual(values=c("#19557E","#6E3B60", "#EA916A", "#EFC76C")) +
  theme(panel.background = element_blank())

Now, concerning the second question I originally raised: as you can see in the source data (DataG) above, I added another variable, G1_ID, which is a group identifier; every respondent belongs to one of two groups. I would like to show separate bar charts for each group. As you can see in the code, I added "[Data_IlA$G1_ID == 2]" after the source data DataG so that R only considers the observations that belong to group 2. However, this addition does not change anything at all. Why is that? What other code could I use to distinguish between the two groups? Should I resort to facet_grid() instead?

Thank you so much for your comments,

Andreas



Source: https://stackoverflow.com/questions/50516487/creating-stacked-bar-chart-with-one-variable-for-each-bar-using-melt-and-ggplo
