Remove duplicated rows


清酒与你 2020-11-22 00:00

I have read a CSV file into an R data.frame. Some of the rows have the same element in one of the columns. I would like to remove rows that are duplicates in th

11 answers

忘了有多久 (original poster)

2020-11-22 00:56
Remove duplicate rows of a dataframe
```
library(dplyr)
mydata <- mtcars

# Remove duplicate rows of the dataframe
distinct(mydata)
```
The mtcars dataset contains no duplicate rows, so distinct() returns the same number of rows as mydata.
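Since mtcars has no repeated rows, the effect is easier to see on data that does contain duplicates. A minimal sketch, assuming a small made-up data frame (df, id, and value are hypothetical names, not taken from the question):

```r
library(dplyr)

# Hypothetical toy data: rows 1 and 3 are identical
df <- data.frame(
  id    = c(1, 2, 1, 3),
  value = c("a", "b", "a", "c")
)

# distinct() with no column arguments compares entire rows
deduped <- distinct(df)

nrow(df)       # 4
nrow(deduped)  # 3
```

Only the first occurrence of each repeated row is kept, and the remaining rows stay in their original order.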

Remove duplicate rows based on one variable
```
library(dplyr)
mydata <- mtcars

# Remove duplicate rows of the dataframe using carb variable
distinct(mydata, carb, .keep_all = TRUE)
```
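.keep_all is an argument of distinct(), not a function. Setting .keep_all = TRUE retains all other variables in the output data frame; for each distinct value of carb, the first matching row is kept.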
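To see what the argument changes, the default behaviour can be compared with .keep_all = TRUE. This is only an illustrative sketch; carb_only and carb_full are names introduced here, not part of the original answer:

```r
library(dplyr)
mydata <- mtcars

# Default (.keep_all = FALSE): only the carb column is returned
carb_only <- distinct(mydata, carb)

# .keep_all = TRUE: all columns are returned, taken from the first row
# found for each distinct carb value
carb_full <- distinct(mydata, carb, .keep_all = TRUE)

ncol(carb_only)                     # 1
ncol(carb_full)                     # 11
nrow(carb_only) == nrow(carb_full)  # TRUE
```

Both calls return one row per distinct carb value; they differ only in which columns come back.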

Remove Duplicate Rows based on multiple variables
```
library(dplyr)
mydata <- mtcars

# Remove duplicate rows of the dataframe using cyl and vs variables
distinct(mydata, cyl, vs, .keep_all = TRUE)
```
As before, .keep_all = TRUE retains all other variables; here one row is kept for each distinct combination of cyl and vs.
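If dplyr is not available, base R's duplicated() should give the same first-occurrence result. A minimal sketch under that assumption; key_cols and deduped are illustrative names, not from the original answer:

```r
mydata <- mtcars

# duplicated() flags every row whose (cyl, vs) pair has already appeared,
# so negating it keeps the first row for each combination
key_cols <- c("cyl", "vs")
deduped <- mydata[!duplicated(mydata[, key_cols]), ]

# No (cyl, vs) pair occurs more than once in the result
anyDuplicated(deduped[, key_cols])  # 0
```

For a single column, such as a data frame read from a CSV file, the same pattern reduces to mydata[!duplicated(mydata$carb), ].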

(from: http://www.datasciencemadesimple.com/remove-duplicate-rows-r-using-dplyr-distinct-function/ )