I have a graph of the form:
(products:Product)-[:in_stock { updated: timestamp }]->(stock_items:StockItem { quantity: q })-[:stored_at]->(locations:Location)
Consider the graph data model below. The days are connected in a linked list, and each Day node carries a timestamp. If I want to collect the Stats nodes within a range, I must first select the Day nodes in that range, and then I can select the Stats nodes (in purple) attached to them. From there I can specify that those purple nodes must be connected to the Group node (in yellow) that is connected to the Location that I specify.
Now if I translate this pattern into Cypher, I get the following:
MATCH (day:Day)
WHERE day.timestamp > 123456789 AND day.timestamp < 234567891
MATCH (topic:Topic), (location:Location { city: "San Francisco" })
WHERE topic.name in ["NoSQL"]
WITH topic, location, day
MATCH (topic)<-[:HAS_TOPIC]-(group:Group)-[:LOCATED_IN]->(location)
WITH DISTINCT group, day
MATCH (group)-[:HAS_MEMBERS]->(stats:Stats)-[:ON_DAY]->(day)
WITH DISTINCT (day.month + "/" + day.day + "/" + day.year) as day,
group.name as group,
stats.count as members,
day.timestamp as timestamp
ORDER BY timestamp
RETURN day, group, members
If you refactored your model to turn the in_stock relationship into a node with a timestamp, and modeled those nodes as a linked list, then you could select the most recent one by specifying the pattern:
MATCH (product:Product { sku: 1234 })-[:HAS_UPDATE]->(update:InStock)
WHERE NOT (update)-[:NEXT]->()
WITH update
MATCH (update)-[:STOCK_ITEMS]->(stockItems:StockItem),
(stockItems)-[:STORED_AT]->(location:Location)
RETURN location.name, stockItems.quantity
This is the most performant way to do this: managing pointers in a linked list allows you both to query on a range (between timestamps) and to query the N most recent items.
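As a rough sketch of the "N most recent items" case, the query below starts at the tail of the list and walks back along the NEXT pointers. It assumes, as the pattern above implies, that NEXT points from an older update to a newer one and that every InStock node is attached to its product with HAS_UPDATE. The `timestamp` property on InStock and the choice of N = 3 are illustrative assumptions, not part of your model.

```cypher
// Sketch only: the 3 most recent InStock updates for one product.
// Assumes (older)-[:NEXT]->(newer) and a hypothetical `timestamp`
// property on each InStock node.
MATCH (product:Product { sku: 1234 })-[:HAS_UPDATE]->(latest:InStock)
WHERE NOT (latest)-[:NEXT]->()
// Walk back 0 to 2 hops from the tail: the tail itself plus up to
// two predecessors, i.e. at most 3 updates.
MATCH (update:InStock)-[:NEXT*0..2]->(latest)
RETURN update
ORDER BY update.timestamp DESC
```

Changing the upper bound of the variable-length pattern to N - 1 returns the N most recent updates. The range case follows the same shape as the Day query: match the InStock nodes and filter on their timestamp in a WHERE clause.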