Count how many vertices in a vertex's neighbourhood have an attribute in igraph for R

Submitted by 最后都变了 on 2020-01-15 10:39:28

Question


I have a large graph (several, actually) in igraph—on the order of 100,000 vertices—and each vertex has an attribute which is either true or false. For each vertex, I would like to count how many of the vertices directly connected to it have the attribute. My current solution is the following function, which takes as its argument a graph.

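attrcount <- function(g) {
  # neighborhood() with order=1 returns each vertex plus its direct neighbours
  nb <- neighborhood(g, order=1)
  # sum the logical attribute (named att, as in the example below) over each neighbourhood
  return(sapply(nb, function(x) {sum(V(g)$att[x])}))
}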

This returns a vector of counts which is off by 1 for vertices which have the attribute (a neighbourhood of order 1 includes the vertex itself), but I can adjust this easily.
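One way to make that adjustment, sketched on a toy graph: subtract the attribute vector itself from the raw counts, so that each vertex stops counting itself. The graph `g_toy` and the attribute values are made up for illustration, and `make_graph` assumes igraph 1.0 or later.

```r
library(igraph)

# Toy path graph 1 - 2 - 3 with a hypothetical logical attribute `att`
g_toy <- make_graph(c(1, 2, 2, 3), directed = FALSE)
V(g_toy)$att <- c(TRUE, FALSE, TRUE)

att <- V(g_toy)$att                      # extract the attribute once
nb  <- neighborhood(g_toy, order = 1)    # each entry includes the vertex itself

# Raw counts include the vertex's own attribute
raw <- sapply(nb, function(x) sum(att[as.integer(x)]))

# Subtracting att (TRUE = 1, FALSE = 0) removes the self-count
fixed <- raw - att
```

Here vertex 2 is the only one with neighbours that carry the attribute, so `fixed` is 0, 2, 0.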

The problem is that this runs incredibly slowly, and it seems like there should be a fast way to do this, since, for instance, computing the degree of each vertex is practically instantaneous with degree(g).

Am I doing this a stupid way?

As an example, suppose this was our graph.

set.seed(42)
g <- erdos.renyi.game(169081, 178058, type="gnm")
V(g)$att <- as.logical(rbinom(vcount(g), 1, 0.5))

Answer 1:


Use get.adjlist to get the adjacent vertices of every vertex, and then run sapply (or tapply, which might be even faster) over this list to get the counts. It is also worth storing the attribute in a vector first, because then you don't need to extract it from the graph every time.

With sapply

system.time({
  al <- get.adjlist(g)
  att <- V(g)$att
  res <- sapply(al, function(x) sum(att[x]))
})
#   user  system elapsed 
#  0.571   0.005   0.576 
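The same idea can be wrapped in a small reusable function. This is only a sketch: the name `count_attr_neighbours` is made up here, `as_adj_list` is the igraph 1.0+ name for `get.adjlist`, and `vapply` is swapped in for `sapply` so that the result is always an integer vector, even for isolated vertices.

```r
library(igraph)

# Hypothetical helper: for each vertex, count the neighbours whose `att` is TRUE
count_attr_neighbours <- function(g, att = V(g)$att) {
  al <- as_adj_list(g)   # one entry of neighbour ids per vertex
  # as.integer() gives plain vertex ids; sum() of an empty selection is 0L
  vapply(al, function(x) sum(att[as.integer(x)]), integer(1))
}

# Toy graph: path 1 - 2 - 3 plus an isolated vertex 4
g_toy <- make_graph(c(1, 2, 2, 3), n = 4, directed = FALSE)
V(g_toy)$att <- c(TRUE, FALSE, TRUE, TRUE)

counts <- count_attr_neighbours(g_toy)
```

Because the adjacency list never contains the vertex itself (unless the graph has self-loops), no off-by-one correction is needed with this approach.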

With tapply

system.time({
  al <- get.adjlist(g)
  alv <- unlist(al)
  alf <- factor(rep(seq_along(al), sapply(al, length)),
                levels=seq_along(al))
  att <- V(g)$att
  res2 <- tapply(att[alv], alf, sum)
  res2[is.na(res2)] <- 0
})
#   user  system elapsed 
#  1.121   0.020   1.144 

all(res == res2)
# TRUE

Somewhat to my surprise, the tapply solution is actually slower.

If this is still not enough, then I guess you can still make it faster by writing it in C/C++.




Answer 2:


For faster computation, use get.adjacency to pull the adjacency matrix, then multiply the matrix by the attribute vector using %*%:

library(igraph)
set.seed(42)
g <- erdos.renyi.game(1000, 1000, type = "gnm")
V(g)$att <- as.logical(rbinom(vcount(g), 1, 0.5))

system.time({
  ma   <- get.adjacency(g)
  att  <- V(g)$att
  res1 <- as.numeric(ma %*% att)
})
#  user  system elapsed 
# 0.003   0.000   0.003 

Compared to using get.adjlist and sapply:

system.time({
  al   <- get.adjlist(g)
  att  <- V(g)$att
  res2 <- sapply(al, function(x) sum(att[x]))
})
#   user  system elapsed 
#  9.733   0.243  10.107

After converting res2 to the same type as res1 (sapply returns integers here, while res1 is double), the result vectors are identical:

res2 <- as.numeric(res2)
identical(res1, res2)
# [1] TRUE
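To see why the matrix product gives the counts: entry i of the product is the sum over j of A[i, j] * att[j], and A[i, j] is 1 exactly when j is a neighbour of i, so only neighbours with the attribute contribute. A minimal sketch on a made-up toy graph, assuming igraph 1.0 or later (`as_adjacency_matrix` is the newer name for `get.adjacency`) and using a dense matrix only because the graph is tiny:

```r
library(igraph)

# Toy path graph 1 - 2 - 3 with a hypothetical logical attribute `att`
g_toy <- make_graph(c(1, 2, 2, 3), directed = FALSE)
V(g_toy)$att <- c(TRUE, FALSE, TRUE)

# Dense 3 x 3 adjacency matrix; keep the sparse default for large graphs
A <- as_adjacency_matrix(g_toy, sparse = FALSE)

# Row i of A picks out the neighbours of vertex i
counts <- as.numeric(A %*% as.numeric(V(g_toy)$att))
```

Only vertex 2 has neighbours with the attribute, so `counts` is 0, 2, 0. Note that if a graph has parallel edges, the adjacency matrix holds edge multiplicities, so a neighbour reached by two edges would be counted twice.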


Source: https://stackoverflow.com/questions/23155365/count-how-many-vertices-in-a-vertexs-neighbourhood-have-an-attribute-in-igraph
