How to filter dataframe with multiple conditions?

Submitted by 我与影子孤独终老i on 2021-02-04 05:57:07

Question


I have this dataframe that I'd like to subset (if possible, with dplyr or base R functions):

df <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))

x  y
1 30
1 10
1  8
2 10
2 18
2  5

Assuming x is a factor (so 2 conditions/levels), how can I subset/filter this dataframe so that I keep only the rows where df$y is greater than 15 when df$x == 1, and the rows where df$y is greater than 5 when df$x == 2?

This is what I'd like to get:

df2 <- data.frame(x = c(1,2,2), y = c(30,10,18))

x  y
1 30
2 10
2 18

Appreciate any help! Thanks!


Answer 1:


You can try this:

with(df, df[ (x==1 & y>15) | (x==2 & y>5), ])
#  x  y
#1 1 30
#4 2 10
#5 2 18

Or with dplyr:

library(dplyr)
filter(df, (x==1 & y>15) | (x==2 & y>5))
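
Since the question also allows base R, the same condition can probably be passed to subset() as well. This is only a minimal sketch using the df and df2 from the question; the name res is introduced here purely for illustration.

```r
# Data and expected result, copied from the question
df  <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))
df2 <- data.frame(x = c(1,2,2), y = c(30,10,18))

# subset() evaluates the condition inside df, so columns can be named directly
res <- subset(df, (x == 1 & y > 15) | (x == 2 & y > 5))

# The kept rows still carry their original row names (1, 4, 5),
# so reset them before comparing against df2
rownames(res) <- NULL
all.equal(res, df2)
```

Like filter(), subset() drops rows where the condition evaluates to NA, which may or may not be what you want if y has missing values.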



Answer 2:


If you have several 'x' groups, one option is to use mapply. We split 'y' using 'x' as the grouping variable, create a vector of thresholds to compare against (c(15,5), one value per group), and use mapply to get the logical index for subsetting 'df'.

df[unlist(mapply('>', split(df$y, df$x), c(15,5))),]
#  x  y
#1 1 30
#4 2 10
#5 2 18
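
Note that the mapply index above comes out in group order, so it lines up with the rows only because df is already sorted by x. A sketch of an alternative that does not depend on row order, assuming the thresholds are kept in a named vector (the names thresholds and keep are made up for illustration), looks up each row's threshold by its x value:

```r
df <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))

# One threshold per value of x, named by that value (hypothetical helper vector)
thresholds <- c("1" = 15, "2" = 5)

# Look up the threshold for each row via its x value, then compare row by row
keep <- df$y > thresholds[as.character(df$x)]
df[keep, ]
```

Adding another group then only means adding another named entry to thresholds.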


Source: https://stackoverflow.com/questions/30037199/how-to-filter-dataframe-with-multiple-conditions
