Question
I have this dataframe that I'd like to subset (if possible, with dplyr or base R functions):
df <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))
x y
1 30
1 10
1 8
2 10
2 18
2 5
Assuming x is a factor (so 2 conditions/levels), how can I subset/filter this dataframe so that I get only the df$y values that are greater than 15 for df$x == 1, and the df$y values that are greater than 5 for df$x == 2?
This is what I'd like to get:
df2 <- data.frame(x = c(1,2,2), y = c(30,10,18))
x y
1 30
2 10
2 18
Appreciate any help! Thanks!
Answer 1:
You can try this:
with(df, df[ (x==1 & y>15) | (x==2 & y>5), ])
x y
1 1 30
4 2 10
5 2 18
Or with dplyr:
library(dplyr)
filter(df, (x==1 & y>15) | (x==2 & y>5))
Answer 2:
If you have several 'x' groups, one option would be to use mapply. We split the 'y' using 'x' as the grouping variable, create the vector of values to compare against (c(15,5)), and use mapply to get the logical index for subsetting the 'df'.
df[unlist(mapply('>', split(df$y, df$x), c(15,5))),]
# x y
#1 1 30
#4 2 10
#5 2 18
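Note that the mapply approach lines up its logical index with the rows only when df is already sorted by x, since split() returns the groups in level order. As a minimal sketch of an order-independent variant, assuming a named lookup vector (thresholds, keep and df_shuffled are hypothetical names, not part of the original answers), each row can be compared against the cut-off for its own x value:

```r
# Example data from the question.
df <- data.frame(x = c(1, 1, 1, 2, 2, 2), y = c(30, 10, 8, 10, 18, 5))

# Hypothetical lookup: names are the x levels, values are the y cut-offs.
thresholds <- c("1" = 15, "2" = 5)

# Look up each row's own cut-off by its x value, then compare row by row.
keep <- df$y > unname(thresholds[as.character(df$x)])
result <- df[keep, ]

# The same lookup on a shuffled copy selects the same (x, y) pairs.
df_shuffled <- df[c(4, 1, 6, 2, 5, 3), ]
keep_shuffled <- df_shuffled$y > unname(thresholds[as.character(df_shuffled$x)])
result_shuffled <- df_shuffled[keep_shuffled, ]
```

Here result holds the rows (1, 30), (2, 10) and (2, 18), matching the desired df2. Adding another group should only require another named entry in thresholds.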
Source: https://stackoverflow.com/questions/30037199/how-to-filter-dataframe-with-multiple-conditions