Question
Consider the following example:
df <- data.frame(X1 = rep(letters[1:3],3),
X2 = 1:9,
X3 = sample(1:50,9))
df
ind<- grep("a|c", df$X1)
library(data.table)
df_ac <- df[ind,]
df_b <- df[!ind,]
df_ac is created using the regular grep command. Now I want to use grep the reverse way, to select all observations with X1 == 'b'.
I know I can do this by:
ind2<- grep("a|c", df$X1, invert = T)
df_b <-df[ind2,]
But in my original script, why does the command df_b <- df[!ind,] return a data frame with zero observations?
Can anyone explain why my logic here is wrong? Is there any other way to select observations in a data.frame by using grep in reverse without specifying invert = T? Thank you!
Answer 1:
You may be more interested in grepl instead of grep:
ind<- grepl("a|c", df$X1)
df[ind,]
# X1 X2 X3
# 1 a 1 16
# 3 c 3 38
# 4 a 4 10
# 6 c 6 18
# 7 a 7 33
# 9 c 9 49
df[!ind,]
# X1 X2 X3
# 2 b 2 5
# 5 b 5 14
# 8 b 8 50
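The reason df[!ind, ] in the question returns zero rows is that grep returns integer positions rather than a logical vector. Applying ! to nonzero integers treats each one as TRUE and negates it, so the result is all FALSE. A minimal sketch that illustrates this, using a fixed X3 column (an assumption made here so the result is reproducible) in place of sample():

```r
# Same layout as the question's data, but with fixed X3 values (assumed)
df <- data.frame(X1 = rep(letters[1:3], 3),
                 X2 = 1:9,
                 X3 = 11:19)

ind_int <- grep("a|c", df$X1)    # integer positions of the matching rows
ind_lgl <- grepl("a|c", df$X1)   # logical vector, one value per row

!ind_int                         # all FALSE: nonzero integers count as TRUE
nrow(df[!ind_int, ])             # 0 rows, as observed in the question
nrow(df[!ind_lgl, ])             # 3 rows, the "b" observations
```

Negation only makes sense on the logical vector from grepl, which is why the answer's df[!ind, ] works once ind is built with grepl.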
Alternatively, go ahead and make use of "data.table" and try out %in% to see what else might work for you. Notice the difference in the syntax.
ind2 <- c("a", "c")
library(data.table)
setDT(df)
df[X1 %in% ind2]
# X1 X2 X3
# 1: a 1 16
# 2: c 3 38
# 3: a 4 10
# 4: c 6 18
# 5: a 7 33
# 6: c 9 49
df[!X1 %in% ind2]
# X1 X2 X3
# 1: b 2 5
# 2: b 5 14
# 3: b 8 50
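For completeness, the integer positions from grep can also be inverted in base R without invert = TRUE. The sketch below is one possible approach, not part of the original answer: it uses setdiff() on the row numbers. Negative indexing (df[-ind, ]) is shorter, but it returns zero rows when grep finds no match, because negating an empty index still gives an empty index. The X3 values are fixed here as an assumption for reproducibility.

```r
# Plain data.frame with the question's layout and fixed X3 values (assumed)
df <- data.frame(X1 = rep(letters[1:3], 3),
                 X2 = 1:9,
                 X3 = 11:19)

ind  <- grep("a|c", df$X1)                 # integer positions of matches
keep <- setdiff(seq_len(nrow(df)), ind)    # every row number not matched
df_b <- df[keep, ]                         # the "b" rows

# Edge case: a pattern that matches nothing
none <- grep("z", df$X1)                   # integer(0)
nrow(df[-none, ])                          # 0 rows, not 9
nrow(df[setdiff(seq_len(nrow(df)), none), ])  # 9 rows, as intended
```

In most cases grepl with ! remains the simpler choice, since a logical vector has no such edge case.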
Source: https://stackoverflow.com/questions/36091366/filtering-observations-by-using-grep-the-reverse-way-in-r