Neo4j: how do I delete all duplicate relationships in the database through cypher?

被撕碎了的回忆 2021-01-03 00:46

I have a huge database with a ton of nodes (10mil+). There is only one type of relationship in the whole database. However, there are a ton of nodes that have duplicated rel

2 answers
  •  小鲜肉 (original poster)
    2021-01-03 01:03

    What error do you get with the database-wide query in the linked SO question? Try substituting | for : in the FOREACH; that is the only breaking syntax difference I can see. The 2.x way to say the same thing, adapted to your database having only one relationship type, might be:

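    // Group relationships by (start, end) node pair; TAIL keeps all but the first
    MATCH (a)-[r]->(b)
    WITH a, b, TAIL (COLLECT (r)) as rr
    // Delete every relationship left in the tail
    FOREACH (r IN rr | DELETE r)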
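
To make the grouping logic concrete, here is a small pure-Python model of what the query above appears to do. It is only a sketch: the `(rel_id, start, end)` tuples and the function name are made up for illustration. Unlike a Python list, Cypher does not promise any particular order inside COLLECT, so which relationship survives in the real database is arbitrary.

```python
from collections import defaultdict


def duplicate_relationships(rels):
    """Return the relationship ids that the query would delete.

    rels: iterable of (rel_id, start_node, end_node) tuples (hypothetical data).
    """
    groups = defaultdict(list)  # plays the role of COLLECT(r) per (a, b) pair
    for rel_id, start, end in rels:
        groups[(start, end)].append(rel_id)

    to_delete = []
    for rel_ids in groups.values():
        to_delete.extend(rel_ids[1:])  # TAIL(...): everything except the first
    return to_delete


rels = [(1, "a", "b"), (2, "a", "b"), (3, "a", "b"), (4, "b", "a"), (5, "a", "c")]
print(duplicate_relationships(rels))  # prints [2, 3]
```

Relationship 4 is kept because the pattern is directed: (b)->(a) is a different pair from (a)->(b).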
    

    I think the WITH pipe will carry the empty tails when there is no duplicate, and I don't know how expensive it is to loop through an empty collection. My sense is that the place to introduce the limit is with a filter after the WITH, something like:

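    MATCH (a)-[r]->(b)
    WITH a, b, TAIL (COLLECT (r)) as rr
    // Drop pairs without duplicates (on newer Neo4j versions, use size(rr))
    WHERE length(rr) > 0
    // Within a single WITH, LIMIT comes before WHERE, so the cap goes in a second WITH
    WITH rr LIMIT 100000
    FOREACH (r IN rr | DELETE r)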
    

    Since this query doesn't touch properties at all (as opposed to yours, which returns properties for (a) and (b)) I don't think it should be very memory heavy for a medium graph like yours, but you will have to experiment with the limit.
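
A limit only caps a single run, so the limited query presumably has to be repeated until nothing is left to delete. A minimal Python sketch of such a loop follows. It assumes a hypothetical `run_batch(limit)` callable that executes the limited query once and returns how many relationships it deleted. The fake below stands in for a real database.

```python
def delete_in_batches(run_batch, limit=100_000, max_rounds=1_000):
    """Call run_batch(limit) until it reports that nothing was deleted.

    run_batch is a hypothetical callable, not part of any driver API.
    max_rounds is a safety stop so a misbehaving callable cannot loop forever.
    """
    total = 0
    for _ in range(max_rounds):
        deleted = run_batch(limit)
        if deleted == 0:
            break
        total += deleted
    return total


# Stand-in for a database holding 250 duplicates. This is a simplification:
# the LIMIT in the query caps node pairs per run, not deleted relationships.
remaining = [250]


def fake_run_batch(limit):
    deleted = min(limit, remaining[0])
    remaining[0] -= deleted
    return deleted


print(delete_in_batches(fake_run_batch, limit=100))  # prints 250
```

In a real setup the callable would wrap a driver call and read the deleted-relationship count from the query statistics. The right limit still has to be found by experiment.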

    If memory is still a problem, then any way of limiting the nodes to work with (without touching properties) is also a good idea. If your nodes are distinguishable by label, try running the query for one label at a time:

    MATCH (a:A)-[r]->(b) //etc..
    MATCH (a:B)-[r]->(b) //etc..
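
If there are many labels, the per-label queries can be generated rather than typed by hand. The sketch below assumes the label names are already known. `QUERY_TEMPLATE` and `per_label_queries` are illustrative names, not part of any library. Labels cannot be passed as query parameters, so they are formatted into the query text and quoted with backticks.

```python
QUERY_TEMPLATE = (
    "MATCH (a:`{label}`)-[r]->(b)\n"
    "WITH a, b, TAIL(COLLECT(r)) AS rr\n"
    "FOREACH (r IN rr | DELETE r)"
)


def per_label_queries(labels):
    """Build one deduplication query per start-node label."""
    queries = []
    for label in labels:
        # A backtick inside the label would end the quoting early, so reject it.
        if "`" in label:
            raise ValueError(f"unsupported label: {label!r}")
        queries.append(QUERY_TEMPLATE.format(label=label))
    return queries


for query in per_label_queries(["A", "B"]):
    print(query, end="\n\n")
```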
    
