Removing data from one dataframe that exists in another dataframe R

一整个雨季 2021-02-10 03:31

I want to remove data from a dataframe that is present in another dataframe. Let me give an example:

letters<-c('a','b','c','d','e')
numbers<-c(1         


        
3 Answers
  • 2021-02-10 03:32

    Base R Solution

    list_one[!list_one$letters %in% list_two$letters2,]
    

    gives you:

      letters numbers
    2       b       2
    5       e       5
    

    Explanation:

    > list_one$letters %in% list_two$letters2
    [1]  TRUE FALSE  TRUE  TRUE FALSE
    

    This gives you a logical vector of the same length as list_one$letters. The ! operator negates it, so you end up with FALSE where the value is present in list_two$letters2 and TRUE where it is not. Indexing list_one with the negated vector keeps only the TRUE rows.
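
    As a minimal, self-contained sketch of that filter: the data below is an assumption, reconstructed from the outputs shown in this answer because the question's own code is cut off. The helper name in_both is only illustrative.

    ```r
    # Assumed example data, reconstructed from the printed outputs
    list_one <- data.frame(letters = c("a", "b", "c", "d", "e"),
                           numbers = 1:5,
                           stringsAsFactors = FALSE)
    list_two <- data.frame(letters2 = c("a", "c", "d"),
                           stringsAsFactors = FALSE)

    # TRUE where the letter in list_one also occurs in list_two
    in_both <- list_one$letters %in% list_two$letters2

    # Negate the vector to keep only rows that do NOT occur in list_two
    result <- list_one[!in_both, ]
    print(result)
    ```

    Storing the logical vector in a variable first makes it easy to inspect which rows will be dropped before you subset.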

    If you have questions about how to select rows from a data.frame enter

    ?`[.data.frame`
    

    in the console and read the help page.

  • 2021-02-10 03:40

    A dplyr solution

    library(dplyr)
    
    list_one %>% anti_join(list_two, by = c("letters" = "letters2"))
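
    Note that the key column is called letters in list_one and letters2 in list_two, so anti_join needs a by argument that maps one name to the other. A minimal sketch, assuming example data reconstructed from the outputs in the other answers (the question's own code is cut off):

    ```r
    library(dplyr)

    # Assumed example data, reconstructed from the printed outputs
    list_one <- data.frame(letters = c("a", "b", "c", "d", "e"),
                           numbers = 1:5,
                           stringsAsFactors = FALSE)
    list_two <- data.frame(letters2 = c("a", "c", "d"),
                           stringsAsFactors = FALSE)

    # Keep the rows of list_one that have no match in list_two,
    # matching list_one$letters against list_two$letters2
    result <- anti_join(list_one, list_two, by = c("letters" = "letters2"))
    print(result)
    ```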
    
  • 2021-02-10 03:58

    This answer is in response to your edit: "so I really can't use the negative expression".

    I guess one of the most efficient ways to do this is using data.table as follows:

    require(data.table)
    setDT(list_one)
    setDT(list_two)
    list_one[!list_two, on=c(letters = "letters2")]
    

    Or

    require(data.table)
    setDT(list_one, key = "letters")
    setDT(list_two, key = "letters2")
    list_one[!list_two]
    

    (Thanks to Frank for the improvement)

    Result:

       letters numbers
    1:       b       2
    2:       e       5
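
    If you want to try the on= variant on its own, here is a minimal sketch. The example data is an assumption, reconstructed from the result above because the question's code is cut off; it builds the tables with data.table() directly instead of converting with setDT().

    ```r
    library(data.table)

    # Assumed example data, reconstructed from the printed result
    list_one <- data.table(letters = c("a", "b", "c", "d", "e"),
                           numbers = 1:5)
    list_two <- data.table(letters2 = c("a", "c", "d"))

    # Not-join: rows of list_one whose letters value has no match
    # in list_two$letters2
    result <- list_one[!list_two, on = c(letters = "letters2")]
    print(result)
    ```

    The on= form does not need keys, so it leaves the row order of list_one untouched.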
    

    Have a look at ?"data.table" and at the question "Quickly reading very large tables as dataframes in R" for why you might use data.table::fread to read the csv files in the first place.

    BTW: If you have a plain vector letters2 instead of the table list_two, you can use

    list_one[!J(letters2)]
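
    A minimal sketch of that vector variant, assuming list_one is keyed on letters and letters2 is a character vector (both reconstructed for illustration, since the question's code is cut off):

    ```r
    library(data.table)

    # Assumed example data; list_one must have a key for J() to join on
    list_one <- data.table(letters = c("a", "b", "c", "d", "e"),
                           numbers = 1:5,
                           key = "letters")
    letters2 <- c("a", "c", "d")

    # J() wraps the vector in a one-column table, and ! turns the
    # keyed join into a not-join
    result <- list_one[!J(letters2)]
    print(result)
    ```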
    