How can I remove all duplicates so that NONE are left in a data frame?

忘了有多久 2020-11-22 08:31

There is a similar question for PHP, but I'm working with R and am unable to translate the solution to my problem.

I have this data frame with 10 rows and 50 columns

3 Answers
  •  不思量自难忘°
    2020-11-22 09:10

    This will extract the rows which appear only once (assuming your data frame is named df):

    df[!(duplicated(df) | duplicated(df, fromLast = TRUE)), ]
    

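    How it works: The function duplicated returns a logical vector that is TRUE for each row identical to a row that occurs earlier in the data frame, so the first occurrence of each row is not flagged. If the argument fromLast = TRUE is used, the scan runs from the last row upward, so the last occurrence is the one left unflagged.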

    Both logical vectors are combined with | (element-wise logical 'or') into a new vector that flags every row appearing more than once. Negating this vector with ! gives a logical vector that is TRUE only for rows appearing exactly once, and that vector is used to subset df.
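
The steps described above can be traced on a small example. This is only a sketch: the data frame contents and the intermediate names (dup_forward, dup_backward, in_dup_group, unique_only) are made up for illustration and are not part of the original question.

```r
# Hypothetical example data: rows 1 and 3 are identical, as are rows 2 and 5.
df <- data.frame(
  id    = c(1, 2, 1, 3, 2),
  value = c("a", "b", "a", "c", "b"),
  stringsAsFactors = FALSE
)

# TRUE for each row that has an identical row earlier in df (scan from the top).
dup_forward <- duplicated(df)

# TRUE for each row that has an identical row later in df (scan from the bottom).
dup_backward <- duplicated(df, fromLast = TRUE)

# A row belongs to a duplicated group if either scan flags it.
in_dup_group <- dup_forward | dup_backward

# Keep only the rows that occur exactly once.
unique_only <- df[!in_dup_group, ]
print(unique_only)
```

Here only the row with id 3 and value "c" remains, since every other row has an identical copy. If df had a single column, df[!in_dup_group, ] would return a plain vector; adding drop = FALSE to the subset keeps the result a data frame.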
