问题
I need help with a question closely related to some other question of mine.
How to merge two different groupings if they are not disjoint with dplyr
As the title of the question says, I want to generate an index in a vector that links different vectors in a list if they have an intersection or, if not, if both intersect with some other vector in a list, and so on...
This is a question involving graph theory/networks - I want to find indirectly connected vectors.
The question above solved my problem considering two columns a dataframe, but I don't know how to generalize this to a list in which elements my have different lengths.
This is an example: list(1:3, 3:5, 5, 6)
should give me c(1, 1, 1, 2)
EDIT:
I've tried using the fact that the powers of an adjacency matrix represent possible paths from one edge to some other one.
find_connections <- function(list_vectors){
list_vectors <- list_vectors %>%
set_names(paste0("x", 1:length(list_vectors)))
x <- crossprod(table(stack(list_vectors)))
power <- nrow(x) - 2
x <- ifelse(x >= 1, 1, 0)
if(power > 0){
z <- accumulate(replicate(power, x, simplify = FALSE),
`%*%`, .init = x) %>%
reduce(`+`)
} else{
z <- x
}
z <- ifelse(z >= 1, 1, 0)
w <- z %>%
as.data.frame() %>%
group_by(across()) %>%
group_indices()
return(w)
}
The problem is that it took too long to run my code. Each matrix is not very large, but I do need to run the function on a large number of them.
Is it possible to improve this?
回答1:
This is one way to do it. It creates a loop for the elements in each vector and then uses the same trick as the previous answer to find clusters.
library(data.table)
library(igraph)
x <- list(1:3, 3:5, 5, 6)
dt <- rbindlist(lapply(x,
function(r) data.table(from = r, to = shift(r, -1, fill = r[1]))))
dg <- graph_from_data_frame(dt, directed = FALSE)
unname(sapply(x, function(v) components(dg)$membership[as.character(v[1])]))
#> [1] 1 1 1 2
来源:https://stackoverflow.com/questions/63984825/assign-the-same-index-if-two-vectors-have-a-common-intersection