Remove duplicated rows

清酒与你 2020-11-22 00:00

I have read a CSV file into an R data.frame. Some of the rows have the same element in one of the columns. I would like to remove rows that are duplicates in th

11 answers
  •  忘了有多久
    2020-11-22 00:56

    Remove duplicate rows of a dataframe

    library(dplyr)
    mydata <- mtcars
    
    # Remove duplicate rows of the dataframe
    distinct(mydata)
    

    mtcars contains no fully duplicated rows, so distinct() returns all 32 rows of mydata unchanged.
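    Since mtcars has nothing to drop, a small made-up data frame may show the effect more clearly. This is only a sketch; the data frame df and its columns id and score are hypothetical and not part of the original answer.

```r
library(dplyr)

# Hypothetical toy data: rows 1 and 3 are identical in every column
df <- data.frame(id = c(1, 2, 1), score = c(10, 20, 10))

# distinct() with no column arguments compares whole rows
deduped <- distinct(df)

# One of the two identical rows is dropped
print(nrow(deduped))
```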



    Remove Duplicate Rows based on one variable

    library(dplyr)
    mydata <- mtcars
    
    # Remove duplicate rows of the dataframe using carb variable
    distinct(mydata, carb, .keep_all = TRUE)
    

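    .keep_all is an argument of distinct(), not a function. Setting .keep_all = TRUE retains all other variables in the output data frame. When several rows share the same carb value, only the first of them is kept.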
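    The difference .keep_all makes can be sketched on a small made-up data frame. The names df, group and value are hypothetical. The last line shows a base R alternative using duplicated(), which may suit the question better if dplyr is not available.

```r
library(dplyr)

# Hypothetical toy data: "group" repeats, "value" differs
df <- data.frame(group = c("a", "a", "b"), value = c(1, 2, 3))

# Without .keep_all, only the column named in distinct() is returned
only_keys <- distinct(df, group)

# With .keep_all = TRUE, the first row of each group is kept with all columns
first_rows <- distinct(df, group, .keep_all = TRUE)

# Base R: duplicated() flags every repeat after the first occurrence
base_rows <- df[!duplicated(df$group), ]
```

    In both the dplyr and base R versions, the row with value 2 is dropped because it repeats group "a".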



    Remove Duplicate Rows based on multiple variables

    library(dplyr)
    mydata <- mtcars
    
    # Remove duplicate rows of the dataframe using cyl and vs variables
    distinct(mydata, cyl, vs, .keep_all = TRUE)
    

    As before, .keep_all = TRUE retains all other variables in the output data frame. Here a row counts as a duplicate only if both its cyl and vs values match an earlier row.
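    A minimal sketch of the multi-column case, again on made-up data (the values in df are hypothetical, only the column names are borrowed from mtcars). The base R line passes a two-column subset to duplicated(), which then compares the cyl and vs values together.

```r
library(dplyr)

# Hypothetical toy data: rows 1 and 2 share the same (cyl, vs) pair
df <- data.frame(
  cyl = c(4, 4, 6, 6),
  vs  = c(1, 1, 0, 1),
  mpg = c(30, 25, 20, 18)
)

# Keep the first row for each distinct (cyl, vs) combination
by_pair <- distinct(df, cyl, vs, .keep_all = TRUE)

# Base R: duplicated() on a data frame subset compares whole rows of that subset
base_pair <- df[!duplicated(df[, c("cyl", "vs")]), ]
```

    Rows 3 and 4 both survive: they share cyl = 6 but differ in vs, so the pair is not repeated.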

    (from: http://www.datasciencemadesimple.com/remove-duplicate-rows-r-using-dplyr-distinct-function/ )
