Neo4j Cypher return most consecutive “passes”

深忆病人

I am trying to return, from a graph database, the students with the most consecutive passes in a series of exams.

Below is my current code but not sure where I can ta

2 answers

余生分开走

2020-12-22 07:40

You can do it with plain Cypher, but I don't think it's very practical - you essentially need to write a program with reduce.

Basically, the "split" works as follows: initialize an empty accumulator list, then iterate through the list of passes/fails and check whether the current element is the same as the previous one. For example, ['pass', 'pass'] keeps the streak, while ['pass', 'fail'] breaks it. If the streak breaks (or we are at the start of the list), append a new single-element list to the accumulator. If it continues, append the current element to the last list in the accumulator, e.g. with a new 'fail', [['pass', 'pass'], ['fail']] becomes [['pass', 'pass'], ['fail', 'fail']].

UNWIND
  [
    ['joe',  'pass'],
    ['matt', 'pass'],
    ['joe',  'fail'],
    ['matt', 'pass'],
    ['joe',  'pass'],
    ['matt', 'pass'],
    ['joe',  'pass'],
    ['matt', 'fail']
  ] AS row
WITH row[0] AS s, row[1] AS passed
WITH s, collect(passed) AS p
WITH s, reduce(acc = [], i IN range(0, size(p) - 1) |
    // i > 0 guard: in Cypher, p[-1] is the last element, not null
    CASE i > 0 AND p[i] = p[i-1]
      // same as previous: append p[i] to the last streak in acc
      WHEN true THEN [j IN range(0, size(acc) - 1) |
          CASE j = size(acc) - 1
            WHEN true THEN acc[j] + [p[i]]
            ELSE acc[j]
          END
        ]
      // first element or a change: start a new streak
      ELSE acc + [[p[i]]]
    END
  ) AS streaks // (1)
UNWIND streaks AS streak
WITH s, streak
WHERE streak[0] <> 'fail'
RETURN s, max(size(streak)) AS consecutivePasses // (2)

Step (1) computes the streaks for each student:

╒══════╤═══════════════════════════════════╕
│"s"   │"streaks"                          │
╞══════╪═══════════════════════════════════╡
│"matt"│[["pass","pass","pass"],["fail"]]  │
├──────┼───────────────────────────────────┤
│"joe" │[["pass"],["fail"],["pass","pass"]]│
└──────┴───────────────────────────────────┘

And in (2), it gives:

╒══════╤═══════════════════╕
│"s"   │"consecutivePasses"│
╞══════╪═══════════════════╡
│"matt"│3                  │
├──────┼───────────────────┤
│"joe" │2                  │
└──────┴───────────────────┘

Of course, in this particular case it's not necessary to do the splitting: simply counting would be enough. But in 99% of practical situations, APOC is the way to go, so I did not bother optimising this solution.
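
For what it's worth, here is a rough sketch of the "simply counting" variant, using the same sample data. The accumulator layout (a two-element list holding the current streak length and the longest streak so far) and the name counts are my own choices for illustration, not anything built into Cypher:

```cypher
UNWIND
  [
    ['joe',  'pass'],
    ['matt', 'pass'],
    ['joe',  'fail'],
    ['matt', 'pass'],
    ['joe',  'pass'],
    ['matt', 'pass'],
    ['joe',  'pass'],
    ['matt', 'fail']
  ] AS row
WITH row[0] AS s, collect(row[1]) AS p
// acc[0] = length of the current streak, acc[1] = longest streak seen so far
WITH s, reduce(acc = [0, 0], x IN p |
    CASE x
      // a pass extends the current streak and may raise the maximum
      WHEN 'pass' THEN [
          acc[0] + 1,
          CASE WHEN acc[0] + 1 > acc[1] THEN acc[0] + 1 ELSE acc[1] END
        ]
      // anything else resets the current streak but keeps the maximum
      ELSE [0, acc[1]]
    END
  ) AS counts
RETURN s, counts[1] AS consecutivePasses
```

On the sample data this should again give 3 for matt and 2 for joe. One difference from the splitting version: a student with no passes at all is returned with 0 here, whereas the WHERE filter above drops such a student from the result.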

时光说笑

2020-12-22 07:42
This is a tricky one, and as far as I know can't be done with just Cypher, but there is a procedure in APOC Procedures that can help.

apoc.coll.split() takes a collection and a value to split around, and will yield records for each resulting sub-collection. Basically, we collect the ordered results per student, split around failures to get collections of consecutive passes, then get the max consecutive passes from the sizes of those collections:
```
MATCH (s:Student)-[r:TAKEN]->(e:Exam)
WITH s, r.score >= e.pass_mark as passed
ORDER BY e.date
WITH s, collect(passed) as resultsColl
CALL apoc.coll.split(resultsColl, false) YIELD value
WITH s, max(size(value)) as consecutivePasses
RETURN s.name as student, consecutivePasses
```
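
To see what the split does in isolation, a minimal sketch on a literal list may help (this assumes the APOC plugin is installed; the alias streakLength is just for illustration):

```cypher
// Split a list of pass/fail booleans around false;
// the separator value itself is not included in the sub-lists.
CALL apoc.coll.split([true, true, false, true], false) YIELD value
RETURN value, size(value) AS streakLength
```

This should yield two rows, [true, true] and [true], so taking max(size(value)) gives the longest run of passes. I would assume that a student whose list contains no true values produces no rows from the split and therefore drops out of the final result, but that is worth checking against your data.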