Grouping a many-to-many relationship from a two-column map

无人及你 2021-01-13 16:45

I have a SQL table that maps, say, authors and books. I would like to group linked authors and books (books written by the same author, and authors who co-wrote a book) toge

3 answers
  •  爱一瞬间的悲伤
    2021-01-13 17:26

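    Here's a go at re-hashing my answer to an old question of mine that Josh O'Brien linked in the comments (identify groups of linked episodes which chain together). This answer uses the igraph library: authors and books become the nodes of a graph, each row of the table becomes an edge, and every connected component is one group.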

    # Dummy data that might be easier to interpret to show it worked
    # Authors 1,2 and 3,4 should group. author 5 is a group to themselves
    aubk <- data.frame(author_id=c(1,2,3,4,5),book_id=c(1,1,2,2,5))
    
    # identify authors with a bit of leading text to prevent clashes 
    # with the book ids
    aubk$author_id2 <- paste0("au",aubk$author_id)
    
    library(igraph)
    # create an undirected graph - this needs to be matrix input
    # (graph_from_edgelist/components replace the deprecated graph.edgelist/clusters)
    au_graph <- graph_from_edgelist(as.matrix(aubk[c("author_id2","book_id")]), directed=FALSE)
    # get the names of all vertices (authors and books)
    result <- data.frame(author_id=V(au_graph)$name,stringsAsFactors=FALSE)
    # get the corresponding group membership of every vertex
    result$group <- components(au_graph)$membership
    
    # subset to only the authors data
    result <- result[substr(result$author_id,1,2)=="au",]
    # make the author_id variable numeric again
    result$author_id <- as.numeric(substr(result$author_id,3,nchar(result$author_id)))
    
    > result
      author_id group
    1         1     1
    3         2     1
    4         3     2
    6         4     2
    7         5     3
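
If igraph is not available, the same connected-components grouping can be sketched in base R with a small union-find. This is not part of the original answer: the names `nodes`, `parent` and `find_root` and the "bk" prefix are made up for illustration, and the sketch skips path compression, so it is likely to be slow on large tables.

```r
# Same dummy data as in the answer above
aubk <- data.frame(author_id = c(1, 2, 3, 4, 5),
                   book_id   = c(1, 1, 2, 2, 5))

# Prefix both id types so author and book ids cannot clash
au <- paste0("au", aubk$author_id)
bk <- paste0("bk", aubk$book_id)   # "bk" prefix is an illustrative choice
nodes <- unique(c(au, bk))

# parent[i] points towards the representative (root) of node i's group;
# initially every node is its own root
parent <- seq_along(nodes)

# Follow parent pointers until a root is reached (hypothetical helper)
find_root <- function(i) {
  while (parent[[i]] != i) i <- parent[[i]]
  i
}

# Each author-book row is an edge: merge the groups of its two endpoints
for (k in seq_len(nrow(aubk))) {
  ra <- find_root(match(au[k], nodes))
  rb <- find_root(match(bk[k], nodes))
  if (ra != rb) parent[[rb]] <- ra
}

# Root of every node, renumbered 1, 2, 3, ... in order of first appearance
roots <- vapply(seq_along(nodes), find_root, integer(1))
aubk$group <- match(roots[match(au, nodes)], unique(roots))

print(aubk)
# group column: 1 1 2 2 3
```

Because the group is attached to each row of `aubk`, both the authors and the books in a row share the same group id, which also covers the "books written by the same author" half of the question.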
    
