How to remove rows in a data frame when one column contains duplicates


Hi, I have a small problem with a data frame that has duplicates in one column. I would like to remove the rows where that column contains duplicate values. For example my datafram

2 Answers

一个人的身影

2021-01-16 09:19
which() should only be used in its "positive" form. The danger of the construction -which() is that when none of the rows or items match the test, which() returns an empty vector, integer(0), and -integer(0) is still empty, so the subset returns 'nothing' when the correct result is 'everything'. Instead, use logical negation:
```
 dat[!duplicated(dat), ]  
```
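To make the -which() pitfall concrete, here is a minimal sketch. The data frame `dat` and its columns `id` and `value` are made up for illustration and are not the OP's data.

```r
# Hypothetical data frame, invented for this illustration
dat <- data.frame(id = c(1, 2, 3), value = c("a", "b", "c"))

# No row satisfies the test, so which() returns an empty integer vector
drop <- which(dat$id > 10)

# Negating an empty index is still an empty index: zero rows come back
wrong <- dat[-drop, ]

# Logical negation keeps every row, which is the intended result
right <- dat[!(dat$id > 10), ]

nrow(wrong)
nrow(right)
```

The first subset silently loses all the data, while the logical version behaves the same whether or not anything matches.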
In this case there were no fully duplicated rows, but the OP thought that some should be removed, so evidently only two or three columns were under consideration. This is easy to accommodate. Just run the duplication test on those two or three columns:
```
 dat[ !duplicated(dat[ , 2:3] ) , ]
```
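As a rough illustration of the difference between the two tests, the sketch below uses an invented data frame in which columns 2 and 3 repeat but the first column does not. The column names `id`, `group` and `score` are assumptions for the example only.

```r
# Hypothetical data: rows 1 and 3 agree in columns 2 and 3, but not in id
dat <- data.frame(
  id    = c(1, 2, 3, 4),
  group = c("a", "b", "a", "b"),
  score = c(10, 20, 10, 30)
)

# Testing whole rows finds no duplicates, because id differs in every row
whole_row_dups <- any(duplicated(dat))

# Testing only columns 2 and 3 flags row 3 as a repeat of row 1
dedup <- dat[!duplicated(dat[, 2:3]), ]

whole_row_dups
dedup$id
```

Rows are compared only on the selected columns, but the full rows are returned, so the remaining columns of the first occurrence are kept.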
没有蜡笔的小新

2021-01-16 09:23
Use the function duplicated().

Something like:
```
data.subset <- data[!duplicated(data$ID),]
```
duplicated() returns a logical (TRUE/FALSE) vector. The first occurrence of each value is FALSE, and the second and any later occurrences are TRUE, so negating the vector keeps the first row for each ID.
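A small sketch of what duplicated() returns, using an invented `data` frame with a repeated `ID`. The question is ambiguous about whether one copy of each duplicated ID should survive, so the second half shows one possible way to drop every copy, using the `fromLast` argument of duplicated().

```r
# Hypothetical data frame in which ID 2 appears three times
data <- data.frame(ID = c(1, 2, 2, 3, 2), x = c("a", "b", "c", "d", "e"))

# Flags the second and later occurrences of each ID
flags <- duplicated(data$ID)

# Keep the first row for each ID
data.subset <- data[!flags, ]

# Alternative: drop every row whose ID occurs more than once,
# by combining a forward scan with a scan from the end
repeated <- flags | duplicated(data$ID, fromLast = TRUE)
data.unique <- data[!repeated, ]

flags
data.subset$ID
data.unique$ID
```

Which variant is right depends on whether the first occurrence counts as a duplicate for your purposes.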