Extract subgraph in neo4j

Posted by 白昼怎懂夜的黑 on 2019-12-28 13:59:50

Question


I have a large network stored in Neo4j. Based on a particular root node, I want to extract a subgraph around that node and store it somewhere else. So, what I need is the set of nodes and edges that match my filter criteria.

As far as I know, there is no out-of-the-box solution. There is a graph matching component, but it only works for exact matches. The Neo4j API itself only defines graph traversal, which I can use to specify which nodes and edges should be visited:

Traverser exp = Traversal
    .description()
    .breadthFirst()
    .evaluator(Evaluators.toDepth(2))
    .traverse(root);

Now I could add the nodes and edges of every path to sets, but this is very inefficient. How would you do it? Thanks!

EDIT: Would it make sense to add only the last node and the last relationship of each traversal path to the subgraph?


Answer 1:


As for graph matching, that component has been superseded by Cypher (http://docs.neo4j.org/chunked/snapshot/cypher-query-lang.html), which would fit nicely and supports fuzzy matching via optional relationships.

For the subgraph representation, I would use the Cypher output to construct new Cypher statements that recreate the graph, much like a SQL export. Something like:

start n=node:node_auto_index(name='Neo') 
match n-[r:KNOWS*]-m 
return "create ({name:'"+m.name+"'});"

See http://console.neo4j.org/r/pqf1rp for an example.
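The same export idea can also be done on the client side once the node names have been fetched. The following is only a minimal sketch in plain Java, not part of the Neo4j API: the class `CypherExportSketch` and its methods are hypothetical names, and it assumes the names are already available as a list of strings. It adds one thing the string concatenation in the query does not do, namely escaping quotes inside a name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Neo4j: turns node names into CREATE statements.
final class CypherExportSketch {

    // Escape backslashes first, then single quotes, so the value stays
    // inside a single-quoted Cypher string literal.
    static String quote(String value) {
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // One CREATE statement per name, in the same shape as the RETURN
    // expression of the query above.
    static List<String> createStatements(List<String> names) {
        List<String> statements = new ArrayList<>();
        for (String name : names) {
            statements.add("create ({name:" + quote(name) + "});");
        }
        return statements;
    }

    public static void main(String[] args) {
        for (String s : createStatements(Arrays.asList("Neo", "Trinity", "O'Brien"))) {
            System.out.println(s);
        }
    }
}
```

Like the query it mirrors, this only recreates nodes. Relationships would need their own statements, which is not shown here.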




Answer 2:


I solved it by constructing the induced subgraph based on all traversal endpoints.

Building the subgraph from only the last node and last relationship of every traversal path does not work, because edges that are not part of any shortest path would not be included.
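A small worked example may make the missing-edge problem concrete. The sketch below uses a toy in-memory graph in plain Java rather than the Neo4j API. The class `InducedSubgraphSketch`, the triangle data and the method names are all made up for illustration, and it assumes a breadth-first walk that visits each node only once. On a triangle A-B-C rooted at A, the "last relationship" of each path gives only A-B and A-C, while the induced subgraph also contains B-C.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model, not the Neo4j API: nodes are ints, edge i joins EDGES[i][0] and EDGES[i][1].
final class InducedSubgraphSketch {

    // Triangle A(0) - B(1) - C(2) - A(0).
    static final int[][] TRIANGLE = { {0, 1}, {0, 2}, {1, 2} };

    // Breadth-first walk from root up to maxDepth, visiting each node once.
    // Fills 'reached' with the visited nodes and returns the indices of the
    // edges by which nodes were first reached (the last edge of each path).
    static Set<Integer> lastEdges(int[][] edges, int root, int maxDepth, Set<Integer> reached) {
        Set<Integer> treeEdges = new HashSet<>();
        Map<Integer, Integer> depth = new HashMap<>();
        Deque<Integer> queue = new ArrayDeque<>();
        depth.put(root, 0);
        reached.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            int current = queue.remove();
            if (depth.get(current) == maxDepth) {
                continue; // do not expand beyond the depth limit
            }
            for (int i = 0; i < edges.length; i++) {
                int other;
                if (edges[i][0] == current) {
                    other = edges[i][1];
                } else if (edges[i][1] == current) {
                    other = edges[i][0];
                } else {
                    continue; // edge does not touch the current node
                }
                if (reached.add(other)) { // true only on the first visit
                    depth.put(other, depth.get(current) + 1);
                    treeEdges.add(i);
                    queue.add(other);
                }
            }
        }
        return treeEdges;
    }

    // Induced subgraph: every edge whose two endpoints are both in 'nodes'.
    static Set<Integer> inducedEdges(int[][] edges, Set<Integer> nodes) {
        Set<Integer> result = new HashSet<>();
        for (int i = 0; i < edges.length; i++) {
            if (nodes.contains(edges[i][0]) && nodes.contains(edges[i][1])) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> reached = new HashSet<>();
        Set<Integer> tree = lastEdges(TRIANGLE, 0, 2, reached);
        Set<Integer> induced = inducedEdges(TRIANGLE, reached);
        System.out.println("last edges: " + tree.size() + ", induced edges: " + induced.size());
    }
}
```

With the triangle and a depth limit of 2, the walk reaches all three nodes but records only two edges, whereas the induced edge set has all three.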

My Neo4j code for building the induced subgraph looks like this:

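import java.util.HashSet;
import java.util.Set;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;

Set<Node> nodes = new HashSet<Node>();
Set<Relationship> edges = new HashSet<Relationship>();

// Collect every node reached by the traversal.
for (Node n : traverser.nodes())
{
    nodes.add(n);
}

// Keep each relationship whose two endpoints were both reached.
for (Node node : nodes)
{
    for (Relationship rel : node.getRelationships())
    {
        if (nodes.contains(rel.getOtherNode(node)))
            edges.add(rel);
    }
}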

Every edge is visited twice: once from its start node and once from its end node. Because the edges are kept in a Set, each one ends up in the collection only once.

It would also be possible to iterate over only the incoming or only the outgoing edges, but then it is unclear how loops (an edge from a node to itself) are handled: which category do they belong to? The snippet above does not have this issue.
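The deduplication argument can be checked on a toy model. This is a sketch in plain Java, not the Neo4j API. The class `LoopDedupSketch`, its `Edge` type and the sample data are hypothetical, and object identity stands in for a relationship id. It says nothing about how Neo4j itself classifies loops by direction. It only shows that iterating over all incident edges and storing them in a Set keeps a self-loop and every ordinary edge exactly once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model, not the Neo4j API.
final class LoopDedupSketch {

    // A directed edge; default equals/hashCode (object identity) plays the
    // role of a relationship id.
    static final class Edge {
        final String from;
        final String to;
        Edge(String from, String to) { this.from = from; this.to = to; }
    }

    // All edges touching 'node' in either direction. In this toy model a
    // self-loop is listed once per call.
    static List<Edge> incident(List<Edge> all, String node) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : all) {
            if (e.from.equals(node) || e.to.equals(node)) {
                result.add(e);
            }
        }
        return result;
    }

    // Same shape as the snippet above: keep an edge when its other endpoint
    // is also in the node set; the Set absorbs the second visit of each edge.
    static Set<Edge> collect(List<Edge> all, Set<String> nodes) {
        Set<Edge> edges = new HashSet<>();
        for (String node : nodes) {
            for (Edge e : incident(all, node)) {
                String other = e.from.equals(node) ? e.to : e.from;
                if (nodes.contains(other)) {
                    edges.add(e);
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        List<Edge> all = Arrays.asList(
                new Edge("a", "b"),   // ordinary edge inside the node set
                new Edge("a", "a"),   // self-loop
                new Edge("b", "c"));  // leaves the node set
        Set<String> nodes = new HashSet<>(Arrays.asList("a", "b"));
        System.out.println("edges kept: " + collect(all, nodes).size());
    }
}
```

Here the edge a->b is seen from both endpoints and the loop a->a only from a, yet each appears once in the result, and b->c is dropped because c is outside the node set.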




Answer 3:


See the Neo4j shell documentation on dumping the database to Cypher statements:

dump START n=node({self}) MATCH p=(n)-[r:KNOWS*]->(m) RETURN n,r,m;

There's also an example of importing the subgraph of a first database (db1) into a second one (db2).



Source: https://stackoverflow.com/questions/16101789/extract-subgraph-in-neo4j
