How to add value labels on the flows of an alluvial/Sankey plot (in R, with ggalluvial)?

Submitted by 老子叫甜甜 on 2020-06-11 10:39:25

Question


I'm looking to label the "flow" portion of an alluvial / Sankey chart in R.

The strata (columns) can easily be labelled, but not the flows connecting them. Reading the documentation and experimenting have not gotten me anywhere.

In the sample below, the "freq" values should appear as labels on the flows.


library(ggplot2)
library(ggalluvial)

data(vaccinations)
levels(vaccinations$response) <- rev(levels(vaccinations$response))
ggplot(vaccinations,
       aes(x = survey, stratum = response, alluvium = subject,
           y = freq,
           fill = response, label = freq)) +
  scale_x_discrete(expand = c(.1, .1)) +
  geom_flow() +
  geom_stratum(alpha = .5) +
  geom_text(stat = "stratum", size = 3) +
  theme(legend.position = "bottom") +
  ggtitle("vaccination survey responses at three points in time")

Answer 1:


There is an option to take the raw numbers and use these as labels for the flow part:

ggplot(vaccinations,
       aes(x = survey, stratum = response, alluvium = subject,
           y = freq,
           fill = response, label = freq)) +
  scale_x_discrete(expand = c(.1, .1)) +
  geom_flow() +
  geom_stratum(alpha = .5) +
  geom_text(stat = "stratum", size = 3) +
  geom_text(stat = "flow", nudge_x = 0.2) +
  theme(legend.position = "bottom") +
  ggtitle("vaccination survey responses at three points in time")

If you want more control over how these points are labelled, you can extract the layer data and do computations on that. For example, we can compute the fractions for only the starting positions as follows:

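# Assume 'g' is the previous plot object saved under a variable
# layer_data() returns the first layer by default, here geom_flow()
newdat <- layer_data(g)
# Keep only the rows describing where each flow starts
newdat <- newdat[newdat$side == "start", ]
# Convert each label to its share of the (stratum, x) group total
groups <- split(newdat, interaction(newdat$stratum, newdat$x))
groups <- lapply(groups, function(dat) {
  dat$label <- dat$label / sum(dat$label)
  dat
})
newdat <- do.call(rbind, groups)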

ggplot(vaccinations,
       aes(x = survey, stratum = response, alluvium = subject,
           y = freq,
           fill = response, label = freq)) +
  scale_x_discrete(expand = c(.1, .1)) +
  geom_flow() +
  geom_stratum(alpha = .5) +
  geom_text(stat = "stratum", size = 3) +
  geom_text(data = newdat, aes(x = xmin + 0.4, y = y, label = format(label, digits = 1)),
            inherit.aes = FALSE) +
  theme(legend.position = "bottom") +
  ggtitle("vaccination survey responses at three points in time")
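
The fraction step above can be checked in isolation on a small hand-made table. This is only a sketch: the data frame `toy` and its values are made up, with column names chosen to mirror the ones used above, and base R's `ave()` is swapped in for the split/lapply/rbind sequence. It also writes the result to a separate `fraction` column rather than overwriting `label`.

```r
# Hypothetical stand-in for the "start"-side rows of the layer data;
# the column names mirror those used above, the values are invented.
toy <- data.frame(
  x       = c(1, 1, 1, 2, 2),
  stratum = c("A", "A", "B", "A", "A"),
  label   = c(30, 10, 20, 5, 15)
)

# Total of 'label' within each (stratum, x) group, repeated per row
group_total <- ave(toy$label, toy$stratum, toy$x, FUN = sum)

# Each label as a share of its group total
toy$fraction <- toy$label / group_total

print(toy)
```

Unlike the split/rbind approach, `ave()` keeps the rows in their original order, which can make it easier to line the result up with other columns.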

Where exactly to place the labels is still something of a judgment call. Placing them at the start is the easy way, but if you want the labels to sit approximately in the middle of the flows and dodge one another, some additional processing is required.
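
One possible shape for that processing, as a rough sketch only: pair the start and end rows of each flow and average their positions. The table `ends` and the pairing column `flow_id` are hypothetical; the answer does not establish that the real layer data exposes such a key, so it may have to be constructed first. The midpoint also only approximates the centre of the drawn flow if the curve is roughly symmetric, and no dodging of overlapping labels is attempted here.

```r
# Hypothetical table: one row per flow end, paired by an invented 'flow_id'.
ends <- data.frame(
  flow_id = c(1, 1, 2, 2),
  side    = c("start", "end", "start", "end"),
  x       = c(1.2, 1.8, 1.2, 1.8),
  y       = c(10, 40, 30, 50),
  label   = c(5, 5, 8, 8)
)

# Split into start and end rows, then join them back on the flow key
start <- ends[ends$side == "start", ]
end   <- ends[ends$side == "end", ]
mid   <- merge(start, end, by = "flow_id", suffixes = c("_start", "_end"))

# Label position: halfway between the two ends of each flow
mid$x     <- (mid$x_start + mid$x_end) / 2
mid$y     <- (mid$y_start + mid$y_end) / 2
mid$label <- mid$label_start

mid <- mid[, c("flow_id", "x", "y", "label")]
print(mid)
```

A table like `mid` could then be passed to geom_text(data = ..., inherit.aes = FALSE) in the same way as `newdat` above.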



Source: https://stackoverflow.com/questions/57745314/how-to-add-value-labels-on-the-flows-item-of-a-alluvial-sankey-plot-on-r-ggallu
