Question
I have the two following data frames (example):
df1:
name profile type strand
A 4.5 1 +
B 3.2 1 +
C 5.5 1 +
D 14.0 1 -
E 45.1 1 -
F 32.8 1 -
G 19.9 1 +
df2:
name
A
B
C
G
I would like to delete the rows in df1 for which df1$name = df2$name to get the following:
Output:
name profile type strand
D 14.0 1 -
E 45.1 1 -
F 32.8 1 -
If anyone could tell me which piece of code to use, it would be a great help. It seemed simple at first, but I've been getting it wrong since yesterday.
Answer 1:
You need the %in% operator. So,
df1[!(df1$name %in% df2$name),]
should give you what you want.
- df1$name %in% df2$name tests whether the values in df1$name are in df2$name.
- The ! operator reverses the result.
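To see the two steps separately, here is a minimal sketch that rebuilds the example data from the question. The variable names in_df2 and result are only illustrative, and the columns are created as plain character vectors (stringsAsFactors = FALSE), which is an assumption about how the data was loaded.

```r
# Rebuild the example data frames from the question.
df1 <- data.frame(
  name    = c("A", "B", "C", "D", "E", "F", "G"),
  profile = c(4.5, 3.2, 5.5, 14.0, 45.1, 32.8, 19.9),
  type    = 1,
  strand  = c("+", "+", "+", "-", "-", "-", "+"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(name = c("A", "B", "C", "G"), stringsAsFactors = FALSE)

# Logical mask: TRUE where df1$name also occurs in df2$name.
in_df2 <- df1$name %in% df2$name

# Negate the mask to keep only the rows whose name is NOT in df2.
result <- df1[!in_df2, ]
print(result)
```

The comma after the mask matters: df1[!in_df2, ] selects rows, while leaving it out would try to select columns.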
Answer 2:
This is sometimes called an anti-join:
library(dplyr)
anti_join(df1, df2, by = "name")
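For a self-contained run against the example from the question, something like the following should work, assuming the dplyr package is installed. The name kept is just an illustrative variable, and the data frames are rebuilt by hand here.

```r
library(dplyr)

# Rebuild the example data frames from the question.
df1 <- data.frame(
  name    = c("A", "B", "C", "D", "E", "F", "G"),
  profile = c(4.5, 3.2, 5.5, 14.0, 45.1, 32.8, 19.9),
  type    = 1,
  strand  = c("+", "+", "+", "-", "-", "-", "+"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(name = c("A", "B", "C", "G"), stringsAsFactors = FALSE)

# Keep the rows of df1 that have no match in df2 on the "name" column.
kept <- anti_join(df1, df2, by = "name")
print(kept)
```

anti_join returns only the columns of df1, so nothing from df2 is added to the result.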
Answer 3:
df1[!(as.character(df1$jobId) %in% as.character(df2$name)), ]
I had to add as.character to my call because name is not a character vector but a factor. Isn't %in% supposed to convert this directly?
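On the factor question: %in% is built on match(), and the documentation for match() says factors are converted to character vectors before comparison, so the explicit as.character should normally not change the result. A small check, assuming both name columns are factors (stringsAsFactors = TRUE was the data.frame default before R 4.0.0); the names plain and coerced are only illustrative:

```r
# Build both data frames with factor columns, as older R versions did by default.
df1 <- data.frame(
  name    = c("A", "B", "C", "D", "E", "F", "G"),
  profile = c(4.5, 3.2, 5.5, 14.0, 45.1, 32.8, 19.9),
  stringsAsFactors = TRUE
)
df2 <- data.frame(name = c("A", "B", "C", "G"), stringsAsFactors = TRUE)

# Confirm that the key columns really are factors.
stopifnot(is.factor(df1$name), is.factor(df2$name))

# Compare %in% on the factors with %in% on explicitly converted strings.
plain   <- df1$name %in% df2$name
coerced <- as.character(df1$name) %in% as.character(df2$name)

# TRUE if the explicit conversion made no difference.
print(identical(plain, coerced))

# Row removal works directly on the factor columns.
print(df1[!plain, ])
```

If the conversion did seem necessary on real data, the cause may lie elsewhere, for example in a column name that does not exist in one of the data frames.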
Source: https://stackoverflow.com/questions/17338411/delete-rows-that-exist-in-another-data-frame