R: igraph, community detection, edge.betweenness method, count/list members of each community?

Submitted by ぃ、小莉子 on 2019-12-02 22:37:40

A couple of these questions can be answered by looking closely at the documentation of the functions you're using. For instance, the "Value" section of the documentation for clusters describes what the function returns, and a couple of those components answer your questions. Documentation aside, you can always use the str function to inspect the structure of any particular object.
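
As a small illustration of the str suggestion, here is a sketch on a hypothetical toy graph (not the all2 data from the question). It assumes a recent igraph release, where components() is the newer name for clusters():

```r
library(igraph)

# Hypothetical toy graph: two triangles and one isolated pair
g <- graph_from_literal(a-b-c-a, d-e-f-d, x-y)

# components() is the newer name for clusters()
comp <- components(g)

# str() shows the pieces of the result: membership, csize and no
str(comp)
```

The same call works on any other object whose layout you're unsure about, such as the result of a community detection function.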

That being said, to get the members or numbers of members in a particular community, you can look at the membership object returned by the clusters function (which you're already using to assign color). So something like:

summary(clusters(all2)$membership)

would describe the IDs of the clusters that are being used. In the case of your sample data, it looks like you have clusters with the IDs ranging from 0 to 585, for 586 clusters in total. (Note that you won't be able to display those very accurately using the coloring scheme you're currently using.)

To determine the number of vertices in each cluster, you can look at the csize component also returned by clusters. In this case, it's a vector of length 586, storing one size for each cluster calculated. So you can use

clusters(all2)$csize

to get the list of sizes of your clusters. Be warned that your cluster IDs, as previously mentioned, start from 0 ("zero-indexed") whereas R vectors start from 1 ("one-indexed"), so you'll need to shift these indices by one. For instance, clusters(all2)$csize[5] returns the size of the cluster with the ID of 4. (This shift applies to igraph releases before 0.6; from 0.6 onward the R interface numbers clusters from 1, so csize[k] is the size of cluster k.)
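
To see how csize lines up with membership, a minimal sketch on a made-up graph may help. It assumes a recent igraph release (cluster IDs starting at 1) and uses components(), the newer name for clusters(); the graph is hypothetical:

```r
library(igraph)

# Hypothetical toy graph: two triangles and one isolated pair
g <- graph_from_literal(a-b-c-a, d-e-f-d, x-y)
comp <- components(g)

# Two routes to the same cluster sizes
sizes_from_csize <- comp$csize
sizes_from_table <- as.integer(table(comp$membership))

# Size of the cluster that vertex "x" belongs to
id_of_x <- comp$membership[match("x", V(g)$name)]
size_of_x_cluster <- comp$csize[id_of_x]
size_of_x_cluster
```

Tabulating the membership vector is a handy cross-check, because it works the same way for any membership vector, whichever function produced it.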

To list the vertices in any cluster, you just want to find which IDs in the membership component previously mentioned match up to the cluster in question. So if I want to find the vertices in cluster #128 (there are 21 of these, according to clusters(all2)$csize[129]), I could use:

which(clusters(all2)$membership == 128)
length(which(clusters(all2)$membership == 128)) #21

and to retrieve the vertices in that cluster, I can index the result of the V function with the same membership test:

> V(all2)[clusters(all2)$membership == 128]
Vertex sequence:
 [1] "625591221 - Clare Clancy"           
 [2] "100000283016052 - Podge Mooney"     
 [3] "100000036003966 - Jennifer Cleary"  
 [4] "100000248002190 - Sarah Dowd"       
 [5] "100001269231766 - LirChild Surfwear"
 [6] "100000112732723 - Stephen Howard"   
 [7] "100000136545396 - Ciaran O Hanlon"  
 [8] "1666181940 - Evion Grizewald"       
 [9] "100000079324233 - Johanna Delaney"  
[10] "100000097126561 - Órlaith Murphy"   
[11] "100000130390840 - Julieann Evans"   
[12] "100000216769732 - Steffan Ashe"     
[13] "100000245018012 - Tom Feehan"       
[14] "100000004970313 - Rob Sheahan"      
[15] "1841747558 - Laura Comber"          
[16] "1846686377 - Karen Ni Fhailliun"    
[17] "100000312579635 - Anne Rutherford"  
[18] "100000572764945 - Lit Đ Jsociety"   
[19] "100003033618584 - Fall Ball"        
[20] "100000293776067 - James O'Sullivan" 
[21] "100000104657411 - David Conway"
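
If you want the members of every cluster at once rather than one ID at a time, base R's split can group the vertex names by membership. This is only a sketch on a hypothetical toy graph, again assuming a recent igraph release with components():

```r
library(igraph)

# Hypothetical toy graph: two triangles and one isolated pair
g <- graph_from_literal(a-b-c-a, d-e-f-d, x-y)
comp <- components(g)

# One list element per cluster ID, each holding the member vertex names
members <- split(V(g)$name, comp$membership)
members

# Number of members per cluster, from the same list
lengths(members)
```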

That would cover the basic igraph questions you had. The other questions are more graph-theory related. I don't know of a way to control the number of clusters to be created using igraph, but someone may be able to point you to a package which is able to do that. You may have more success posting that as a separate question, either here or in another venue.

Regarding your first point about wanting to iterate through all possible communities, I think you'll find that to be infeasible for a graph of significant size. The number of possible arrangements of the membership vector for 5 different clusters would be 5^n, where n is the size of the graph. If you want to find "all possible communities", that number will actually be O(n^n), if my mental math is correct. Essentially, it would be impossible to calculate that exhaustively over any reasonably sized network, even given massive computational resources. So I think you'll be better off using some sort of intelligence/optimization for determining the number of communities represented in your graph, as the clusters function does.
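
To get a feel for that growth, here is a rough base-R sketch. The helper bell_number is a hypothetical illustration, not part of igraph; it counts the ways to split n vertices into any number of unlabeled groups (the Bell number, which never exceeds n^n) and is shown next to the 5^n labeled assignments for five clusters:

```r
# Hypothetical helper: n-th Bell number via the Bell triangle
bell_number <- function(n) {
  stopifnot(n >= 1)
  row <- 1                      # first row of the triangle
  if (n == 1) return(1)
  for (i in 2:n) {
    new_row <- numeric(i)
    new_row[1] <- row[length(row)]   # each row starts with the last entry of the previous one
    for (j in 2:i) {
      new_row[j] <- new_row[j - 1] + row[j - 1]
    }
    row <- new_row
  }
  row[length(row)]              # last entry of row n is the n-th Bell number
}

n <- c(5, 10, 20)
data.frame(
  n = n,
  five_labels = 5^n,                    # membership vectors with 5 labels
  partitions = sapply(n, bell_number),  # all possible groupings
  upper_bound = n^n
)
```

Even at n = 20 these counts are far too large to enumerate, and real networks are usually much bigger.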

Regarding "how to control the number of communities" in the OP's question, I use the cut_at function on the communities to cut the resulting hierarchical structure into a desired number of groups. I hope someone can confirm that I am doing something sane. Namely, consider the following:

#Generate graph
adj.mat<- matrix(,nrow=200, ncol=200) #empty matrix
set.seed(2) 

##populate adjacency matrix
for(i in 1:200){adj.mat[i,sample(rep(1:200), runif(1,1,100))]<-1}
adj.mat[which(is.na(adj.mat))] <-0

diag(adj.mat) <- 0 #no self-loops

G<-graph.adjacency(adj.mat, mode='undirected')
plot(G, vertex.label=NA)

##Find clusters
walktrap.comms<- cluster_walktrap(G, steps=10)
max(walktrap.comms$membership) #43
walktrap.comms$membership

  [1]  6 34 13  1 19 19  3  9 20 29 12 26  9 28  9  9  2 14 13 14 27  9 33 17 22 23 23 10 17 31  9 21  2  1
 [35] 33 23  3 26 22 29  4 16 24 22 25 31 23 23 13 30 35 27 25 15  6 14  9  2 16  7 23  4 18 10 10 22 27 27
 [69] 23 31 27 32 36  8 23  6 23 14 19 22 19 37 27  6 27 22  9 14  4 22 14 32 33 27 26 14 21 27 22 12 20  7
[103] 14 26 38 39 26  3 14 23 22 14 40  9  5 19 29 31 26 26  2 19  6  9  1  9 23  4 14 11  9 22 23 41 10 27
[137] 22 18 26 14  8 15 27 10  5 33 21 28 23 22 13  1 22 24 14 18  8  2 18  1 27 12 22 34 13 27  3  5 27 25
[171]  1 27 13 34  8 10 13  5 17 17 25  6 19 42 31 13 30 32 15 30  5 11  9 25  6 33 18 33 43 10

Note that there are 43 groups, but we want coarser cuts, so examine the dendrogram:

plot(as.hclust(walktrap.comms), labels=FALSE)

Then cut based on it. I arbitrarily chose 6 groups; whatever number you pick, you now have coarser clusters:

cut_at(walktrap.comms, no=6)

  [1] 4 2 5 4 5 5 3 5 3 4 3 5 5 3 5 5 3 1 5 1 1 5 1 6 1 1 1 4 6 5 5 2 3 4 1 1 3 5 1 4 6 6 3 1 5 5 1 1 5 4 3 1
 [53] 5 2 4 1 5 3 6 3 1 6 6 4 4 1 1 1 1 5 1 4 3 3 1 4 1 1 5 1 5 2 1 4 1 1 5 1 6 1 1 4 1 1 5 1 2 1 1 3 3 3 1 5
[105] 3 3 5 3 1 1 1 1 3 5 2 5 4 5 5 5 3 5 4 5 4 5 1 6 1 3 5 1 1 1 4 1 1 6 5 1 3 2 1 4 2 1 2 3 1 1 5 4 1 3 1 6
[157] 3 3 6 4 1 3 1 2 5 1 3 2 1 5 4 1 5 2 3 4 5 2 6 6 5 4 5 3 5 5 4 4 2 4 2 3 5 5 4 1 6 1 2 4
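
As a follow-up check, the vector returned by cut_at can be tabulated to see how many vertices land in each of the coarser groups. The sketch below uses a hypothetical toy graph (two 5-vertex cliques joined by one edge) rather than the random graph above:

```r
library(igraph)

# Hypothetical toy graph: two 5-cliques connected by a single bridge edge
g <- make_full_graph(5) %du% make_full_graph(5)
g <- add_edges(g, c(5, 6))

wc <- cluster_walktrap(g)

# Cut the walktrap dendrogram into 2 groups
grp <- cut_at(wc, no = 2)

# Group sizes, and the vertex indices in each group
table(grp)
split(seq_len(vcount(g)), grp)
```

The same tabulation applies to the 6-group cut of walktrap.comms shown above.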