Question
I am using a dplyr pipeline to clean my df and then feed it directly into ggplot. However, I want to plot only one group at a time, so I need to filter to just that group. The problem is that I want the scales to remain constant, as if all groups were present. Is it possible to further filter a piped df inside the ggplot() call? Example below.
library(dplyr)
library(ggplot2)
# create df
set.seed(1)
df <- data.frame(matrix(nrow=100,ncol=5))
colnames(df) <- c("year","group","var1","var2","var3")
df$year <- rep(1:4,each=25)
df$group <- rep(c("a","b","c","d","e"),times=20)
df$var1 <- runif(100,min=0,max=30)
df$var2 <- sample(1:500,100,replace=TRUE)
df$var2[1:25] <- sample(1:100,25,replace=TRUE)
df$var3 <- runif(100,min=0,max=100)
Now pipe it to clean it (here we're just doing some random stuff to it), then plot:
df %>%
  filter(var3 < 80) %>% # random thing 1 - filter some stuff
  filter(var2 < 400) %>% # random thing 2 - filter more
  mutate(var2 = as.numeric(var2)) %>% # random thing 3 - mutate a column
  ggplot(aes(x=group,y=var1,color=var2)) +
  geom_point()
I want to plot only one year at a time (from the "year" column), in a way that lets me plot each year in a loop while keeping the colorbar scaled to the values of the full df.
Here's what I tried so far:
dlist <- c(1:4) #list of years
i <- 2 #current year
df %>%
  filter(var3 < 80) %>%
  filter(var2 != 56) %>%
  mutate(var2 = as.numeric(var2)) %>%
  # I can filter for year here, but then the colorbar is scaled to this subset only, which is no good
  filter(year %in% dlist[i]) %>%
  ggplot(aes(x=group,y=var1,color=var2)) +
  geom_point()
I think there should be a way to use . and %>% within the ggplot parentheses so that the scale remains... but I can't quite figure it out.
dlist <- c(1:4) #list of years
i <- 2 #current year
df %>%
  filter(var3 < 80) %>%
  filter(var2 != 56) %>%
  mutate(var2 = as.numeric(var2)) %>%
  ggplot(data = . %>% filter(year %in% dlist[i]), aes(x=group,y=var1,color=var2)) +
  geom_point()
but that gives me this error:
Error: You're passing a function as global data.
Have you misspelled the `data` argument in `ggplot()`
What is the best way to do this?
Answer 1:
You might plot one layer invisibly and then a filtered layer using data = . %>% filter(...):
df %>%
  filter(var3 < 80) %>%
  filter(var2 != 56) %>%
  mutate(var2 = as.numeric(var2)) %>%
  ggplot(aes(x=group,y=var1,color=var2)) +
  geom_point(alpha = 0) + # invisible layer: trains the scales on all years
  geom_point(data = . %>% filter(year %in% dlist[i])) # visible layer: current year only
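This works because ggplot() itself rejects a function as its data (hence the error in the question), whereas a layer's data argument accepts a function and applies it to the plot's default data. A minimal sketch of how this could be wrapped for looping over years follows. The toy data only mimics the shape of the question's df, and the names clean and plot_year are hypothetical, introduced here for illustration. A plain function(d) is used in place of the . %>% filter(...) shorthand so the year is captured when the plot is created rather than when it is printed.

```r
library(dplyr)
library(ggplot2)

# Toy data in the shape of the question's df (not the identical random draws)
set.seed(1)
df <- data.frame(
  year  = rep(1:4, each = 25),
  group = rep(c("a", "b", "c", "d", "e"), times = 20),
  var1  = runif(100, min = 0, max = 30),
  var2  = sample(1:500, 100, replace = TRUE),
  var3  = runif(100, min = 0, max = 100)
)

# Cleaning steps from the question, run once
clean <- df %>%
  filter(var3 < 80, var2 != 56) %>%
  mutate(var2 = as.numeric(var2))

# Hypothetical helper: one plot per year, scales trained on all of `data`
plot_year <- function(data, yr) {
  force(yr)  # fix the year now, not when the plot is eventually printed
  ggplot(data, aes(x = group, y = var1, color = var2)) +
    geom_point(alpha = 0) +                               # invisible, all years
    geom_point(data = function(d) filter(d, year == yr))  # visible, one year
}

# One plot per year; print(plots[[2]]) would draw year 2
plots <- lapply(1:4, function(yr) plot_year(clean, yr))
```

Because the invisible layer contains every row, the x, y, and color scales are all trained on the full cleaned data, not only the colorbar.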
Answer 2:
You can use scale_color_gradient and set the limits of your scale:
df %>%
  filter(var3 < 80 & var2 != 56) %>%
  mutate(var2 = as.numeric(var2)) %>%
  filter(year %in% dlist[i]) %>% # filter for the current year
  ggplot(aes(x=group,y=var1,color=var2)) +
  geom_point() +
  # fix the colorbar limits to the range of the full, unfiltered df
  scale_color_gradient(limits = c(min(df$var2),max(df$var2)))
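Note that the limits above come from the raw df, so the colorbar can span values that the cleaning steps have already removed, and only the color scale is held fixed. A possible variation, sketched below, computes the limits once from the cleaned data and also pins the x and y scales. The toy data only mimics the shape of the question's df, and the names clean, col_lim, y_lim, x_lim, and plot_year_fixed are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Toy data in the shape of the question's df (not the identical random draws)
set.seed(1)
df <- data.frame(
  year  = rep(1:4, each = 25),
  group = rep(c("a", "b", "c", "d", "e"), times = 20),
  var1  = runif(100, min = 0, max = 30),
  var2  = sample(1:500, 100, replace = TRUE),
  var3  = runif(100, min = 0, max = 100)
)

# Cleaning steps from the question, run once
clean <- df %>%
  filter(var3 < 80, var2 != 56) %>%
  mutate(var2 = as.numeric(var2))

# Limits taken from the cleaned data, shared by every per-year plot
col_lim <- range(clean$var2)
y_lim   <- range(clean$var1)
x_lim   <- sort(unique(clean$group))

# Hypothetical helper: filter to one year, then apply the shared limits
plot_year_fixed <- function(data, yr) {
  data %>%
    filter(year == yr) %>%
    ggplot(aes(x = group, y = var1, color = var2)) +
    geom_point() +
    scale_color_gradient(limits = col_lim) +
    scale_y_continuous(limits = y_lim) +
    scale_x_discrete(limits = x_lim)
}

plots_fixed <- lapply(1:4, function(yr) plot_year_fixed(clean, yr))
```

Compared with the invisible-layer approach, this keeps each plot object small (it holds only one year's rows) at the cost of naming every scale that should stay constant.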
Source: https://stackoverflow.com/questions/59957340/filter-a-piped-df-within-ggplot