Plotting in ggplot using cumsum

Submitted by 心已入冬 on 2020-06-17 14:18:12

Question


I am trying to use ggplot2 to plot a date column vs. a numeric column.

I have a data frame in which I want to label each row's country as either China or not China. I successfully created the data frame linked below with:

is_china <- confirmed_cases_worldwide %>%
  filter(country == "China", type=='confirmed') %>%
  group_by(country) %>%
  mutate(cumu_cases = cumsum(cases)) 

is_not_china <- confirmed_cases_worldwide %>%
  filter(country != "China", type=='confirmed') %>%
  mutate(cumu_cases = cumsum(cases))

is_not_china$country <- "Not China"

china_vs_world <- rbind(is_china,is_not_china)

Now I want to plot a line graph of cumu_cases against date, comparing "China" with "Not China". I am trying to execute this code:

plt_china_vs_world <- ggplot(china_vs_world) +
  geom_line(aes(x=date,y=cumu_cases,group=country,color=country)) +
  ylab("Cumulative confirmed cases") 

Now I keep getting a graph looking like this:

I don't understand why this is happening. I have tried converting data types and other methods. Any help is appreciated; I linked both CSV files below.

https://github.com/king-sules/Covid


Answer 1:


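The 'date' values for the other countries are repeated because their 'country' has been changed to 'Not China', so each date now appears once per original country. This can be fixed either in the OP's 'is_not_china' step or in 'china_vs_world', by summing the cases per 'country' and 'date' before taking the cumulative sum: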

library(ggplot2)
library(dplyr)
china_vs_world %>%
   group_by(country, date) %>%
   summarise(cumu_cases = sum(cases)) %>%   # total new cases per country and date
   group_by(country) %>%                    # keep a separate running total per country
   mutate(cumu_cases = cumsum(cumu_cases)) %>%
   ungroup() %>%
   ggplot() +
    geom_line(aes(x=date,y=cumu_cases,group=country,color=country)) +
       ylab("Cumulative confirmed cases")

-output

NOTE: The China numbers only look small because of the y-axis scale.

As @Edward mentioned, a log scale would make the plot easier to read:

china_vs_world %>%
   group_by(country, date) %>%
   summarise(cumu_cases = sum(cases)) %>%   # total new cases per country and date
   group_by(country) %>%                    # keep a separate running total per country
   mutate(cumu_cases = cumsum(cumu_cases)) %>%
   ungroup() %>%
   ggplot() +
    geom_line(aes(x=date,y=cumu_cases,group=country,color=country)) +
       ylab("Cumulative confirmed cases") +
    scale_y_continuous(trans='log')         # natural-log y axis
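A base-10 log axis may give tick labels that are easier to read than the natural-log transform. The sketch below uses ggplot2's scale_y_log10() on a small made-up data frame; the name 'toy' and all of its values are assumptions for illustration only, not the real case counts.

```r
library(ggplot2)

# Toy cumulative counts (made-up values, for illustration only)
toy <- data.frame(
  date       = rep(as.Date("2020-03-01") + 0:2, times = 2),
  country    = rep(c("China", "Not China"), each = 3),
  cumu_cases = c(100, 200, 400, 1000, 10000, 100000)
)

# Same line plot as in the answer, but with a base-10 log y axis
p <- ggplot(toy) +
  geom_line(aes(x = date, y = cumu_cases, group = country, color = country)) +
  ylab("Cumulative confirmed cases") +
  scale_y_log10()

print(p)
```

Note that a log axis cannot display zero counts, so dates with a cumulative total of 0 would drop out of the line.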

Or with a facet_wrap

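china_vs_world %>%
   group_by(country, date) %>%
   summarise(cumu_cases = sum(cases)) %>%   # total new cases per country and date
   group_by(country) %>%                    # keep a separate running total per country
   mutate(cumu_cases = cumsum(cumu_cases)) %>%
   ungroup() %>%
  ggplot() +
    geom_line(aes(x=date,y=cumu_cases,group=country,color=country)) +
      ylab("Cumulative confirmed cases") +
    facet_wrap(~ country, scales = 'free_y') # one panel per country, each with its own y range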

data

china_vs_world <- read.csv("https://raw.githubusercontent.com/king-sules/Covid/master/china_vs_world.csv", stringsAsFactors = FALSE)
china_vs_world$date <- as.Date(china_vs_world$date)
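The other option mentioned in this answer, fixing the OP's 'is_not_china' step directly, is not shown above. A minimal sketch of it, assuming the same column names as in the question ('country', 'date', 'type', 'cases'), could look like the following. The data frame 'toy_cases' and its values are made up so the example runs on its own.

```r
library(dplyr)

# Toy stand-in for confirmed_cases_worldwide (made-up values, for illustration only)
toy_cases <- data.frame(
  country = c("China", "China", "Italy", "Italy", "Spain", "Spain"),
  date    = as.Date(c("2020-03-01", "2020-03-02", "2020-03-01",
                      "2020-03-02", "2020-03-01", "2020-03-02")),
  type    = "confirmed",
  cases   = c(10, 20, 1, 2, 3, 4),
  stringsAsFactors = FALSE
)

is_not_china <- toy_cases %>%
  filter(country != "China", type == "confirmed") %>%
  group_by(date) %>%                    # collapse all non-China countries ...
  summarise(cases = sum(cases)) %>%     # ... into one row per date
  arrange(date) %>%                     # cumsum depends on row order
  mutate(country = "Not China",
         cumu_cases = cumsum(cases))

print(is_not_china)
```

Because each date now appears only once for "Not China", geom_line() has a single point per date to connect. The columns of this result differ from those of 'is_china' (there is no 'type' column), so they would need to be aligned before combining the two with rbind().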


Source: https://stackoverflow.com/questions/61993479/plotting-in-ggplot-using-cumsum
