R's duplicated returns a vector showing whether each element of a vector or data frame is a duplicate of an element with a smaller subscript. So if rows 3, 4,
I had a similar problem, but I needed to identify duplicated rows by the values in specific columns only. I came up with the following dplyr solution:
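library(dplyr)

df <- df %>%
  group_by(Column1, Column2, Column3) %>%
  # length(Column1) is the group size here; n() is an equivalent shortcut
  mutate(Duplicated = case_when(length(Column1) > 1 ~ "Yes",
                                TRUE ~ "No")) %>%
  ungroup()  # assumed ending: the original pipe was left dangling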
The code groups the rows by the specified columns. If a group contains more than one row, the code marks every row in that group as duplicated, including the first occurrence. Once that is done, you can use the Duplicated column for filtering, etc.
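For anyone who wants to try this out, here is a minimal sketch on a made-up data frame. The data and the Value column are hypothetical, and I have used n() in place of length(Column1), which should give the same group size. The last two lines are a base R cross-check rather than part of the dplyr approach: since duplicated() on its own flags only the later occurrences, a forward and a backward pass are combined to flag every member of a group.

```r
library(dplyr)

# Hypothetical toy data; the key column names match the snippet above.
df <- data.frame(
  Column1 = c("a", "a", "b", "c", "a"),
  Column2 = c(1, 1, 2, 3, 1),
  Column3 = c("x", "x", "y", "z", "q"),
  Value   = 1:5
)

# Flag every row whose key combination occurs more than once.
flagged <- df %>%
  group_by(Column1, Column2, Column3) %>%
  mutate(Duplicated = if_else(n() > 1, "Yes", "No")) %>%
  ungroup()

# Use the new column for filtering: keep only the duplicated rows.
dupes <- flagged %>%
  filter(Duplicated == "Yes")

# Base R cross-check: forward pass OR backward pass over the key columns.
keys <- df[c("Column1", "Column2", "Column3")]
base_flag <- duplicated(keys) | duplicated(keys, fromLast = TRUE)
```

Here only the first two rows share all three key values, so those are the rows that end up in dupes. The fifth row matches on Column1 and Column2 but not on Column3, so it is left alone.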