Getting top n records for each group in neo4j

心在旅途 2021-02-15 03:15

I need to group data from a Neo4j database and then keep only the top n records of each group.

Example:

I have two node

3 answers
  • 2021-02-15 03:45

    Try

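    // Sort each order's articles by the time they were added
    MATCH (o:Order)-[r:ADDED]->(a:Article)
    WITH o, r, a
    ORDER BY o.oid, r.t
    // Keep only the first 2 articles per order, then count across orders
    WITH o, COLLECT(a)[..2] AS topArticlesByOrder
    UNWIND topArticlesByOrder AS a
    RETURN a.aid AS articleId, COUNT(*) AS count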
    

    Results look like

    articleId    count
       8           6
       2           2
       4           5
       7           2
       3           3
       6           5
       0           7
    

    on this sample graph created with

    FOREACH(opar IN RANGE(1,15) |
        MERGE (o:Order {oid:opar})
        FOREACH(apar IN RANGE(1,5) |
            MERGE (a:Article {aid:toInteger(RAND()*10)})
            CREATE (o)-[:ADDED {t:timestamp() - toInteger(RAND()*1000)}]->(a)
        )
    )
    
  • Use LIMIT combined with ORDER BY to get the top N of anything. For example, the top 5 scores would be:

    MATCH (node:MyScoreNode) 
    RETURN node
    ORDER BY node.score DESC
    LIMIT 5;
    

    The ORDER BY clause puts the highest scores first. LIMIT then keeps only the first 5 rows, which, because the rows are sorted, are always the 5 highest.
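
ORDER BY plus LIMIT works on the whole result set rather than on each group. The small Python sketch below illustrates this, with a plain list standing in for query results. The sample rows and variable names are made up for illustration.

```python
# Hypothetical (group, score) rows standing in for query results.
rows = [("a", 90), ("a", 85), ("a", 80), ("b", 70), ("b", 60)]

# Rough equivalent of ORDER BY score DESC LIMIT 2:
# sort all rows by score, highest first, and keep the first 2.
global_top = sorted(rows, key=lambda row: row[1], reverse=True)[:2]
print(global_top)
```

In this sample both surviving rows belong to group "a", so a per-group top n needs an extra grouping step.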

  • 2021-02-15 03:50

    I tried to achieve your desired results and failed.

    So my guess is that this is impossible with pure Cypher.

    What is the problem? Cypher treats everything as paths and actually performs a traversal.
    Grouping the results and then filtering each group would mean that Cypher has to branch its traversal at some point. Instead, Cypher applies the filter to all results at once, because they are treated as one collection of different paths.

    My suggestion: create several queries that together achieve the desired functionality, and implement some client-side logic.
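
The client-side logic suggested here could look roughly like the sketch below. It assumes the rows of a simple query (order id, relationship timestamp, article id) have already been fetched into Python tuples. The function name `top_n_per_group` and the sample rows are hypothetical, and no Neo4j driver call is shown.

```python
from collections import Counter, defaultdict
from operator import itemgetter


def top_n_per_group(rows, n):
    """Return {group_key: [values of the first n rows by sort key]}.

    rows is an iterable of (group_key, sort_key, value) tuples,
    e.g. (o.oid, r.t, a.aid); this row shape is an assumption.
    """
    groups = defaultdict(list)
    for group_key, sort_key, value in rows:
        groups[group_key].append((sort_key, value))

    result = {}
    for group_key, members in groups.items():
        # Sort ascending by the sort key only, then keep the first n values.
        members.sort(key=itemgetter(0))
        result[group_key] = [value for _, value in members[:n]]
    return result


# Hypothetical sample rows: (order id, timestamp, article id).
rows = [
    (1, 30, 7), (1, 10, 3), (1, 20, 8),
    (2, 5, 3), (2, 50, 0),
    (3, 99, 4),
]
top_two = top_n_per_group(rows, 2)

# Count how often each article appears among the per-order top 2.
article_counts = Counter(aid for aids in top_two.values() for aid in aids)
print(top_two)
print(article_counts)
```

Grouping in a dictionary first and sorting each group separately keeps the per-group cut-off independent of how many rows the other groups have.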
