Match/group duplicate rows (indices)

情深已故 2021-02-05 09:47

How can I efficiently match/group the indices of duplicated rows?

Let's say I have this data set:

set.seed(14)
dat <- data.frame(mtc         


        
2 answers
  •  鱼传尺愫
    2021-02-05 10:39

    We can use dplyr. Following a similar approach to @AnandaMahto's post, we create a row-index column with add_rownames(), group by all of the original columns, filter to keep only the groups with more than one row, summarise 'rowname' into a list, and extract that list column.

    library(dplyr)
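    add_rownames(dat) %>%                  # move row names into a 'rowname' column
      group_by_(.dots = names(dat)) %>%    # group on every original column
      filter(n() > 1) %>%                  # keep only groups with duplicated rows
      summarise(rn = list(rowname)) %>%    # collect each group's row names in a list
      .$rn                                 # extract the list column
    #[[1]]
    #[1] "3"  "7"  "8"  "10" "11"
    #
    #[[2]]
    #[1] "2"  "13"
    #
    #[[3]]
    #[1] "1" "4" "5" "6" "9"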
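
In current dplyr releases, add_rownames() and group_by_() are deprecated, so the same steps could be sketched with tibble::rownames_to_column() and group_by(across(...)) instead. This is a minimal sketch, not the answerer's code: the toy data frame and the name dup_groups are hypothetical and stand in for the question's dat, and it assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data (not the question's `dat`):
# rows 1, 3, 5 are identical, rows 2, 4 are identical, row 6 is unique.
toy <- data.frame(
  a = c(1, 2, 1, 2, 1, 3),
  b = c("x", "y", "x", "y", "x", "z")
)

dup_groups <- toy %>%
  rownames_to_column("rowname") %>%              # used here in place of add_rownames()
  group_by(across(all_of(names(toy)))) %>%       # used here in place of group_by_(.dots = ...)
  filter(n() > 1) %>%                            # keep only groups with duplicated rows
  summarise(rn = list(rowname), .groups = "drop") %>%
  pull(rn)                                       # extract the list column

print(dup_groups)
```

Each list element holds the row names of one set of identical rows, and rows that occur only once are dropped by the filter() step.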
    
