Summarize the self-join index while avoiding cartesian product in R data.table

南旧 2021-01-15 07:09

With a 2-column data.table, I'd like to summarize the pairwise relationships in column 1 by summing the number of shared elements in column 2. In other words,

3 answers
  •  不思量自难忘°
    2021-01-15 07:31

    You already have a solution written in SQL, so I suggest the R package sqldf, which runs that SQL directly against the table.

    Here's the code:

    library(sqldf)
    
    # Self-join test on Y, then count the shared Y values for each pair of X values
    result <- sqldf("SELECT A.X AS X1, B.X AS X2, COUNT(A.Y) AS N FROM test AS A JOIN test AS B ON A.Y = B.Y GROUP BY A.X, B.X")
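
For comparison, the same count can be sketched in data.table itself. This is only a minimal sketch under assumptions: the sample `test` table below is hypothetical (the question's data is not shown), and the output names `X1` and `X2` are chosen here for illustration. Note that it still materializes every matching pair before counting, so it does not by itself avoid the large intermediate result.

```r
library(data.table)

# Hypothetical sample data; the question's actual table is not shown.
test <- data.table(
  X = c("a", "a", "b", "b", "c"),
  Y = c(1, 2, 1, 2, 2)
)

# Self-join on Y. allow.cartesian = TRUE is required because the join
# returns more rows than nrow(x) + nrow(i).
pairs <- test[test, on = "Y", allow.cartesian = TRUE]

# After the join, X comes from the left table and i.X from the right one.
# Count the shared Y values for each pair of X values.
result <- pairs[, .N, by = .(X1 = X, X2 = i.X)]
print(result)
```

Each row of `result` holds one ordered pair of X values and the number of Y values they share, including each value paired with itself.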
    
