Finding ALL duplicate rows, including “elements with smaller subscripts”


借酒劲吻你 2020-11-21 07:55

R's duplicated returns a vector showing whether each element of a vector or data frame is a duplicate of an element with a smaller subscript. So if rows 3, 4,
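To make the behaviour described above concrete, here is a minimal base R sketch. The data frame and the variable names later_only and all_dups are made up for illustration. It shows that duplicated() skips the first occurrence, and that one common way to flag every copy is to combine it with a second pass using fromLast = TRUE.

```r
# Hypothetical example data: rows 1, 3 and 4 are identical.
df <- data.frame(a = c(1, 2, 1, 1, 3),
                 b = c("x", "y", "x", "x", "z"))

# duplicated() flags only repeats that follow an earlier identical row,
# so row 1 (the first occurrence) is not flagged.
later_only <- duplicated(df)

# Scanning from the end as well and OR-ing the two results flags every
# row that has at least one identical twin, first occurrence included.
all_dups <- duplicated(df) | duplicated(df, fromLast = TRUE)

print(later_only)
print(all_dups)
print(df[all_dups, ])
```

Subsetting with all_dups returns rows 1, 3 and 4 of this toy data frame, whereas later_only alone would return only rows 3 and 4.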

7 Answers

Happy的楠姐 (original poster)

2020-11-21 08:25
I had a similar problem but I needed to identify duplicated rows by values in specific columns. I came up with the following dplyr solution:
```
library(dplyr)

# Flag every row whose (Column1, Column2, Column3) combination occurs more than once.
df <- df %>%
  group_by(Column1, Column2, Column3) %>%
  mutate(Duplicated = case_when(n() > 1 ~ "Yes",
                                TRUE ~ "No")) %>%
  ungroup()
```
The code groups the rows by the specified columns. If a group contains more than one row, the code marks all of the rows in that group as duplicated, including the first occurrence. Once that is done, you can use the Duplicated column for filtering, etc.
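As a rough end-to-end illustration of the approach above, the sketch below runs it on a small made-up data frame and then filters on the Duplicated column. The sample values, the extra Value column and the names flagged and dups are assumptions for the example only, and it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical data; Column1..Column3 follow the naming used in the answer.
df <- data.frame(
  Column1 = c("a", "a", "b", "c", "c"),
  Column2 = c(1, 1, 2, 3, 3),
  Column3 = c("x", "x", "y", "z", "w"),
  Value   = 1:5
)

# Mark every row whose key combination appears more than once.
flagged <- df %>%
  group_by(Column1, Column2, Column3) %>%
  mutate(Duplicated = case_when(n() > 1 ~ "Yes",
                                TRUE ~ "No")) %>%
  ungroup()

# Keep all members of duplicated groups, first occurrence included.
dups <- filter(flagged, Duplicated == "Yes")
print(dups)
```

Here only the first two rows share the same values in all three key columns, so only they end up in dups. The last two rows differ in Column3 and are therefore not treated as duplicates of each other.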