Removing Only Adjacent Duplicates in Data Frame in R

南笙 2021-01-12 21:20

I have a data frame in R that is supposed to have duplicates. However, there are some duplicates that I would need to remove. In particular, I only want to

3 Answers
  •  夕颜 (original poster)
    2021-01-12 21:47

    Try

    df[with(df, c(x[-1] != x[-nrow(df)], TRUE)), ]
    #   x  y
    #1  A  1
    #2  B  2
    #3  C  3
    #4  A  4
    #5  B  5
    #6  C  6
    #7  A  7
    #9  B  9
    #10 C 10
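
    The input data frame is not shown in this thread. As a minimal sketch for reproducing the result above, the following assumes a ten-row `df` reconstructed from the printed output; the values in row 8 (`B`, `8`) are an assumption, since that row never appears in the output.

    ```r
    # Assumed input, reconstructed from the output shown above (row 8 is a guess).
    df <- data.frame(
      x = c("A", "B", "C", "A", "B", "C", "A", "B", "B", "C"),
      y = 1:10
    )

    # TRUE where a row's x differs from the next row's x; the last row is always kept.
    keep <- with(df, c(x[-1] != x[-nrow(df)], TRUE))
    result <- df[keep, ]
    print(result)
    ```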
    

    Explanation

    Here, each element is compared with the element next to it. To do that, drop the first element of the column and compare the result with the same column with its last element dropped, so that both vectors have the same length:

    df$x[-1]          # first element removed
    #[1] B C A B C A B B C
    df$x[-nrow(df)]   # last element `C` removed
    #[1] A B C A B C A B B

    df$x[-1] != df$x[-nrow(df)]
    #[1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE
    

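    The result is one element shorter than nrow(df), because one element was removed from each vector. Position i of the result says whether row i differs from row i + 1. To make up for the missing element, append a TRUE (which always keeps the last row) and use the resulting logical vector to subset the data frame. As a consequence, the last row of each run of adjacent duplicates is the one that is kept.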
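
    The same idea can be wrapped in a small function. This is only a sketch: `drop_adjacent_dups` and its `keep` argument are hypothetical names, not part of the original answer, and the sample `df` is the same assumed reconstruction of the input. It assumes the column contains no NA values. Appending TRUE keeps the last row of each run, as in the answer, while prepending TRUE keeps the first row instead.

    ```r
    # Hypothetical helper (not from the original answer): drop rows whose value in
    # column `col` repeats the neighbouring row, keeping one row per run.
    drop_adjacent_dups <- function(df, col, keep = c("last", "first")) {
      keep <- match.arg(keep)
      n <- nrow(df)
      if (n < 2) return(df)              # nothing to compare
      x <- df[[col]]
      changed <- x[-1] != x[-n]          # length n - 1: row i versus row i + 1
      mask <- if (keep == "last") c(changed, TRUE) else c(TRUE, changed)
      df[mask, , drop = FALSE]
    }

    # Assumed sample data (row 8 is a guess, as it is absent from the output above).
    df <- data.frame(
      x = c("A", "B", "C", "A", "B", "C", "A", "B", "B", "C"),
      y = 1:10
    )

    last_kept  <- drop_adjacent_dups(df, "x")           # drops row 8, as in the answer
    first_kept <- drop_adjacent_dups(df, "x", "first")  # drops row 9 instead
    ```

    Which variant to use depends on which row of a duplicated pair carries the values in the other columns that should survive.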
