How can I efficiently match/group the indices of duplicated rows?
Let's say I have this data set:
set.seed(14)
dat <- data.frame(mtc
We can use dplyr. Using a similar methodology as @AnandaMahto's post, we create a row index column (add_rownames()), group by all the columns, filter the dataset to the groups with more than one row, summarise the 'rowname' column into a list, and extract that list column.
library(dplyr)
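add_rownames(dat) %>%                  # move row names into a 'rowname' column
  group_by_(.dots = names(dat)) %>%    # group by every original column
  filter(n() > 1) %>%                  # keep only groups that contain duplicates
  summarise(rn = list(rowname)) %>%    # collect the row names of each group
  .$rn                                 # extract the resulting list column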
#[[1]]
#[1] "3" "7" "8" "10" "11"
#[[2]]
#[1] "2" "13"
#[[3]]
#[1] "1" "4" "5" "6" "9"
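Note that add_rownames() and group_by_() are deprecated in current dplyr releases. Below is a minimal sketch of the same pipeline with their replacements, assuming dplyr >= 1.0 and the tibble package are installed. The data frame `toy` and the result name `res` are hypothetical, made up for illustration, since the original `dat` is not fully shown.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data (not the `dat` from the question):
# rows 1 and 3 are identical, rows 2 and 5 are identical, row 4 is unique.
toy <- data.frame(x = c(1, 2, 1, 3, 2),
                  y = c("a", "b", "a", "c", "b"))

res <- toy %>%
  rownames_to_column("rowname") %>%            # replaces add_rownames()
  group_by(across(all_of(names(toy)))) %>%     # replaces group_by_(.dots = ...)
  filter(n() > 1) %>%                          # keep only duplicated rows
  summarise(rn = list(rowname), .groups = "drop") %>%
  pull(rn)                                     # extract the list column

res
#[[1]]
#[1] "1" "3"
#[[2]]
#[1] "2" "5"
```

As in the original, each list element holds the row names of one set of identical rows, and rows that occur only once are dropped.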