I have about 10000 replicates of a sample in a matrix. My matrix has 10000 rows and 6 columns. Numbers in the columns range from 0 to 58 depending on the sample.
Here is a data.table implementation:
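library(data.table)
# Convert the matrix to a data.table (unnamed matrix columns become V1..V6)
dt <- as.data.table(new.matrix)
# Group by every column: .N counts identical rows, .I[1] is the index of the first matching row
head(dt[, list(repeats = .N, id = .I[1]), by = names(dt)][order(repeats, decreasing = TRUE)], 20)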
#     V1 V2 V3 V4 V5 V6 repeats   id
#  1:  5  7 11  8 13 14       4  543
#  2:  5 11 13  5 10 14       4  579
#  3:  6  8  6 10 12 16       4 1433
#  4:  6  9  9  9  9 16       4 1688
#  5:  8  8  9  7 10 16       4 2382
#  6:  6 10  8  7 11 16       4 2965
#  7:  7  9 11  8 11 12       4 3114
#  8:  8  8 10  7 10 15       4 3546
#  9:  7  8 12  9  9 13       4 5759
# 10:  7  7 13  9 10 12       4 9021
# 11:  8 10  8  8 12 12       3   81
# 12:  9 10  7  7 11 14       3  110
# 13:  7 11  8  6 12 14       3  130
# 14: 11  9  7  7  9 15       3  143
# 15:  8 10 10  7 11 12       3  330
# 16:  8  9 10  8 13 10       3  480
# 17:  9 10  7 10 11 11       3  542
# 18:  8  6 11  9 11 13       3  555
# 19:  7 10  7  6 10 18       3  577
# 20:  7  8 10  5 12 16       3  601
where repeats is how many times a row shows up, and id is the index of the first row in the matrix that matches that pattern.
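To check the behaviour on something small enough to verify by eye, here is a minimal sketch of the same grouping on a toy matrix. The matrix `m` and the variable `counts` are made up for illustration and are not part of the original data; the grouping call itself mirrors the one above.

```r
library(data.table)

# Hypothetical toy matrix standing in for new.matrix: 6 rows, 3 columns.
# Row (1,2,3) appears three times, (4,5,6) twice, (7,8,9) once.
m <- matrix(c(1, 2, 3,
              4, 5, 6,
              1, 2, 3,
              7, 8, 9,
              4, 5, 6,
              1, 2, 3),
            ncol = 3, byrow = TRUE)

dt <- as.data.table(m)  # unnamed columns become V1, V2, V3

# Group on every column; .N is the group size and .I[1] is the
# position of the group's first row in the original table.
counts <- dt[, list(repeats = .N, id = .I[1]), by = names(dt)]

# Most frequent patterns first.
counts <- counts[order(repeats, decreasing = TRUE)]
print(counts)
```

Here the pattern (1,2,3) should come out with repeats = 3 and id = 1, (4,5,6) with repeats = 2 and id = 2, and (7,8,9) with repeats = 1 and id = 4. Filtering with `counts[repeats > 1]` would keep only the rows that are actually duplicated.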