I have a graph which contains an unknown number of disconnected subgraphs. What's a good algorithm (or Java library) to find them all?
I think what you are looking for is generally called a Flood Fill. It is up to you whether you traverse the graph through a BFS or a DFS.
Basically you take an unlabeled (AKA uncoloured) node and assign a new label to it. You assign the same label to all nodes adjacent to that one, and so on to all nodes that are reachable from that node.
When no more reachable nodes can be labeled, you start over by picking another unlabeled node. Notice that the fact that this new node is unlabeled implies that it is not reachable from our earlier node and is thus in a different disconnected component.
When there are no more unlabeled nodes, the number of distinct labels you had to use is the number of components of the graph. The label for each node tells you which node is part of which component.
Not a Java implementation but perhaps it will be useful for someone, here is how to do it in Python:
import networkx as nx
g = nx.Graph()
# add nodes/edges to graph
d = list(nx.connected_component_subgraphs(g))
# d contains disconnected subgraphs
# d[0] contains the biggest subgraph
More information here.
There are many facets of this question that are not fully explained, so I'm going to give a somewhat exhaustive answer. My tendency to post walls-of-text notwithstanding. :/ Note also that I'm assuming this is a homework question or self-education question, so I'm not going to give a straight-up answer.
The two basic algorithms for detecting connectivity of a graph is Depth First Search and Breadth First Search. Those are really the 2 starting points you want to look at. Both will get you to the solution, but in different ways, and its hard to argue which is "better" without considering some pretty in-depth aspects of the problem. But let's move on.
As I mentioned before, you left out some important details, and I'll touch upon some possibilities here.
Is your graph directed or undirected? and do you consider connectivity in the "strong" sense (in which case, see oggy's answer), or connectivity in the "weak" sense? Depending on your answer, you will have to approach your algorithm in a subtly different way. Note that, for an undirected graph, weak and strong connectivity are equivalent, so that's nice. But you'll have to keep the structure of the graph in mind regardless, while implementing or finding an algorithm.
Furthermore, there is the question of what you mean by "finding the subgraphs" (paraphrase). Usually graph connectivity is a decision problem -- simply "there is one connected graph" or "there are two or more sub-graphs (aka, it's disconnected)". Having an algorithm for that requires the least amount of bookwork, which is nice. :) The next step up would be the count of graphs, literally the number of them, and that bookwork is also not so bad. Penultimately, you may want a list of nodes in each sub-graph. Lastly, you may want to literally copy out the sub-graphs, edges and all (so the return type would be a list of graphs, I suppose, with an implied invariant that each graph is connected). None of these different result-types would require a different algorithm, but will certainly require a different approach to the bookwork.
All of this may seem like a ridiculous amount of overkill for what is a pretty basic question, but I thought I'd just highlight all of the aspects involved in even such a simple graph question. As a sort of cliffhanger, note how none of this even yet touches upon runtime or memory usage! :) - Agor
I'm assuming you want to find all the (strongly) connected components? For that you can use Tarjan's algorithm (a variant of DFS)
What about a breadth first search to find all connected nodes? Once you have the list of connected nodes, you subtract this list from the list of all nodes. You end up with a list of disconnected nodes
来源:https://stackoverflow.com/questions/1348783/finding-all-disconnected-subgraphs-in-a-graph