Filter a piped df within ggplot

末鹿安然 submitted on 2021-02-11 13:53:29

Question


I am using a dplyr pipeline to clean my df and then feed it directly into ggplot. However, I want to plot only one group at a time, so I need to filter to just that group. The problem is that I want the scales to stay constant, as if all groups were present. Is it possible to filter a piped df further inside the ggplot() call? Example below.

# load the packages used below
library(dplyr)
library(ggplot2)

# create df
set.seed(1)
df <- data.frame(matrix(nrow=100,ncol=5)) 
colnames(df) <- c("year","group","var1","var2","var3") 
df$year <- rep(1:4,each=25)
df$group <- rep(c("a","b","c","d","e"),times=20)
df$var1 <- runif(100,min=0,max=30)
df$var2 <- sample(1:500,100,replace=TRUE) 
df$var2[1:25] <- sample(1:100,25,replace = TRUE)
df$var3 <- runif(100,min=0,max=100)

Now pipe it to clean it (here we're just doing some random stuff to it), then plot:

df %>%
  filter(var3 < 80) %>%   # random thing 1 - filter some stuff
  filter(var2 < 400) %>%   # random thing 2 - filter more
  mutate(var2 = as.numeric(var2)) %>%  # random thing 3 - mutate a column
  ggplot(aes(x=group,y=var1,color=var2)) + 
  geom_point()

I want to plot only one year at a time (from the "year" column), in a way that lets me plot each year in a loop while keeping the colorbar scaled to the values of the full df.

Here's what I have tried so far:

dlist <- c(1:4)   #list of years
i <- 2    #current year

df %>%
  filter(var3 < 80) %>%
  filter(var2 != 56) %>%
  mutate(var2 = as.numeric(var2)) %>%
  # I can filter for year here, but then the colorbar is scaled to this
  # subset alone, which is no good.
  filter(year %in% dlist[i]) %>%
  ggplot(aes(x=group,y=var1,color=var2)) + 
  geom_point()

I think there should be a way to use . and %>% within the ggplot parentheses so that the scale remains... but I can't quite figure it out.

dlist <- c(1:4)   #list of years
i <- 2    #current year

df %>%
  filter(var3 < 80) %>%
  filter(var2 != 56) %>%
  mutate(var2 = as.numeric(var2)) %>%
  ggplot(data = .%>%filter(year %in% dlist[i]), aes(x=group,y=var1,color=var2)) + 
  geom_point()

But that gives me this error:

Error: You're passing a function as global data.
Have you misspelled the `data` argument in `ggplot()`

What is the best way to do this?


Answer 1:


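ggplot() itself will not accept a function as its data, but an individual layer will: when a layer's data argument is a function, ggplot2 applies it to the plot's data. So you might plot one layer invisibly, to train the scales on all rows, and then add a filtered layer using data = . %>% filter(...):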

df %>%
  filter(var3 < 80) %>%
  filter(var2 != 56) %>%
  mutate(var2 = as.numeric(var2)) %>%
  ggplot(aes(x=group,y=var1,color=var2)) + 
  geom_point(alpha = 0) +
  geom_point(data = . %>% filter(year %in% dlist[i]))
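
Since the question asks for one plot per year in a loop, here is a minimal sketch of how this answer's approach might be wrapped up for that. The names `cleaned`, `years` and `plots` are assumptions introduced for illustration, and the toy data only mimics the shape of the question's df. It uses lapply() rather than a for loop because the function passed as layer data is only called when the plot is built, so each plot needs its own copy of the year value.

```r
library(dplyr)
library(ggplot2)

# Toy data in the same shape as the question's df (values are illustrative).
set.seed(1)
df <- data.frame(
  year  = rep(1:4, each = 25),
  group = rep(c("a", "b", "c", "d", "e"), times = 20),
  var1  = runif(100, min = 0, max = 30),
  var2  = sample(1:500, 100, replace = TRUE),
  var3  = runif(100, min = 0, max = 100)
)

# Clean once, outside the loop.
cleaned <- df %>%
  filter(var3 < 80, var2 != 56) %>%
  mutate(var2 = as.numeric(var2))

years <- sort(unique(cleaned$year))

# One plot per year. The invisible layer sees every row and trains the scales;
# the visible layer gets a function, which ggplot2 applies to the plot data.
plots <- lapply(years, function(yr) {
  ggplot(cleaned, aes(x = group, y = var1, color = var2)) +
    geom_point(alpha = 0) +
    geom_point(data = function(d) filter(d, year == yr)) +
    ggtitle(paste("Year", yr))
})
names(plots) <- years
```

Printing `plots[["2"]]` then draws only year 2, with the axes and colorbar trained on every row of `cleaned`.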




Answer 2:


You can use scale_color_gradient and set the limits of the scale explicitly, so that the colorbar no longer depends on which rows are plotted:

df %>%
    filter(var3 < 80 & var2 != 56) %>%
    mutate(var2 = as.numeric(var2)) %>%
    filter(year %in% dlist[i]) %>%   # filter to the current year; the fixed limits below keep the colorbar constant
    ggplot(aes(x=group,y=var1,color=var2)) + 
    geom_point()+
    scale_color_gradient(limits = c(min(df$var2),max(df$var2)))
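
Note that the limits above come from the raw df, so they also span values that the cleaning steps removed. As a variation, and only a sketch under assumptions, the limits could be computed once from the cleaned data and reused for every year. `cleaned`, `color_limits`, `group_levels` and the helper `plot_year()` are hypothetical names, and fixing the x and y scales as well goes beyond what this answer shows.

```r
library(dplyr)
library(ggplot2)

# Toy data in the same shape as the question's df (values are illustrative).
set.seed(1)
df <- data.frame(
  year  = rep(1:4, each = 25),
  group = rep(c("a", "b", "c", "d", "e"), times = 20),
  var1  = runif(100, min = 0, max = 30),
  var2  = sample(1:500, 100, replace = TRUE),
  var3  = runif(100, min = 0, max = 100)
)

cleaned <- df %>%
  filter(var3 < 80, var2 != 56) %>%
  mutate(var2 = as.numeric(var2))

# Limits taken from the cleaned data, computed once.
color_limits <- range(cleaned$var2)
y_limits     <- range(cleaned$var1)
group_levels <- sort(unique(cleaned$group))

# Hypothetical helper: plot one year with scales fixed to the full data.
plot_year <- function(yr) {
  cleaned %>%
    filter(year == yr) %>%
    ggplot(aes(x = group, y = var1, color = var2)) +
    geom_point() +
    scale_color_gradient(limits = color_limits) +
    scale_x_discrete(limits = group_levels) +
    scale_y_continuous(limits = y_limits)
}

p2 <- plot_year(2)
```

Calling `plot_year()` for each year in turn should give plots that share the same colorbar and axes.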


Source: https://stackoverflow.com/questions/59957340/filter-a-piped-df-within-ggplot
