How to count unique combinations from a data table in R?

遥遥无期 2021-01-26 23:55

I have a data table with three columns. The first two are a collection of the data points (categorical data that can be either A, B, or C). The third column is a concatenation o

3 Answers
  • 2021-01-27 00:17
    dt1[, paste(sort(c(CAT1, CAT2)), collapse = " & "), by = 1:nrow(dt1)][, table(V1)]
    
  • 2021-01-27 00:17

    You can also do something like the following. Note: as mentioned by @chinsoon12, we can use pmin and pmax:

     > setDT(dt1)[,list(Count=.N) ,paste(pmin(CAT1, CAT2), pmax(CAT1, CAT2), sep=' & ')]
       paste Count
    1: a & a     1
    2: b & b     2
    3: c & c     2
    4: a & b     2
    5: a & c     3
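
    To see why pmin and pmax work here, a minimal base-R sketch may help. The vectors x and y are made-up stand-ins for CAT1 and CAT2 (the original data is not shown), and the sketch assumes they are character vectors rather than unordered factors, which pmin/pmax do not accept.

    ```r
    # Hypothetical stand-ins for CAT1 and CAT2 (plain character vectors)
    x <- c("a", "b", "c", "c")
    y <- c("b", "a", "a", "c")

    # pmin/pmax compare element-wise, so each pair gets one canonical order
    key_vec <- paste(pmin(x, y), pmax(x, y), sep = " & ")

    # The same key built pair by pair with sort(), for comparison
    key_row <- mapply(function(u, v) paste(sort(c(u, v)), collapse = " & "),
                      x, y, USE.NAMES = FALSE)

    # Check that both approaches produce the same labels
    same_keys <- identical(key_vec, key_row)
    print(key_vec)
    print(same_keys)
    ```

    Because the whole column is handled in one vectorised call, this avoids grouping or looping row by row.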
    
  • 2021-01-27 00:32

    I'm no good with data.table, so here's my answer with a data.frame:

    Just sort the two CATs before pasting, making sure they're always in the same order.

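     # use only the two category columns so the existing third column is not sorted in
     dt1$merged <- apply(dt1[, c("CAT1", "CAT2")], 1, function(x) paste(sort(x), collapse = " & "))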
    

    I'm sure there's a faster way to do it with data.table, but I'm not sure how. A naive sort added to your code came up with an error...
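
    For completeness, here is a rough sketch of the whole data.frame route, from sorting to counting. The sample rows are invented for illustration (the asker's table is not shown), and the column names combination and count are hypothetical labels chosen here, not part of the original question.

    ```r
    # Hypothetical sample data standing in for dt1
    dt1 <- data.frame(
      CAT1 = c("a", "b", "c", "a", "b"),
      CAT2 = c("b", "a", "c", "c", "b"),
      stringsAsFactors = FALSE
    )

    # Sort each row's pair so "b, a" and "a, b" produce the same label
    dt1$merged <- apply(dt1[, c("CAT1", "CAT2")], 1,
                        function(x) paste(sort(x), collapse = " & "))

    # Tabulate the labels, then reshape into a two-column data.frame
    counts <- as.data.frame(table(dt1$merged), stringsAsFactors = FALSE)
    names(counts) <- c("combination", "count")
    print(counts)
    ```

    With this sample data, "a & b" is counted twice because the rows (a, b) and (b, a) collapse to the same label.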
