Question
How can I load Neo4j into memory on demand?
At different stages of my long-running jobs I persist nodes and relationships to Neo4j. Neo4j therefore has to stay on disk, since the data may consume too much memory and I don't know in advance when I'm going to run read queries against it.
But at some point (only once) I will want to run a pretty heavy read query against my Neo4j server, and it performs very poorly (hours). As a solution, I want to load all of Neo4j into RAM for better performance.
What is the best option for this? Should I use a RAM disk, or are there better solutions?
P.S.
A query with [r:LINK_REL_1*2] runs pretty fast, [r:LINK_REL_1*3] takes 17 seconds, and [r:LINK_REL_1*4] takes more than 5 minutes. I don't know exactly how long, since I have a 5-minute timeout. But I need the [r:LINK_REL_1*2..4] query to run in a reasonable time.
My heavy query, with PROFILE:
PROFILE
MATCH path = (start:COLUMN)-[r:LINK_REL_1*2]->(col:COLUMN)
WHERE start.ENTITY_ID = '385'
WITH path UNWIND NODES(path) AS col
WITH path,
COLLECT(DISTINCT col.DATABASE_ID) as distinctDBs
WHERE LENGTH(path) + 1 = SIZE(distinctDBs)
RETURN path
Updated query, with PROFILE (it showed the same performance in my tests):
PROFILE
MATCH (start:COLUMN)
WHERE start.ENTITY_ID = '385'
MATCH path = (start)-[r:LINK_REL_1*2]->(col:COLUMN)
WITH path, REDUCE(dbs = [], col IN NODES(path) |
CASE WHEN col.DATABASE_ID in dbs
THEN dbs
ELSE dbs + col.DATABASE_ID END) as distinctDbs
WHERE LENGTH(path) + 1 = SIZE(distinctDbs)
RETURN path
Answer 1:
The APOC procedures library has apoc.warmup.run(), which may load much of Neo4j into cached memory. See if that makes a difference.
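The warmup suggestion can be tried from a client script. Below is a minimal Python sketch, assuming the APOC plugin is installed on the server and that the installed APOC version still provides apoc.warmup.run (newer releases may no longer ship it). The function name warm_up_page_cache is hypothetical. The session argument is assumed to be any object with a run(query) method whose result offers single() and data(), as sessions from the official neo4j Python driver do.

```python
# Sketch only: the procedure name comes from the answer above; everything else
# (function name, duck-typed session) is an illustrative assumption.
WARMUP_QUERY = "CALL apoc.warmup.run()"


def warm_up_page_cache(session):
    """Run the APOC warmup procedure once and return its statistics as a dict.

    Returns an empty dict if the procedure yields no row.
    """
    record = session.run(WARMUP_QUERY).single()
    return record.data() if record is not None else {}
```

With the official driver, this would be called inside a `with driver.session() as session:` block, right before the heavy read query, so that the cache is warm only when it is actually needed.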
Answer 2:
It looks like you're trying to write a query in which the path contains only :PERSON nodes from distinct countries. Is this right?
If so, I think we can find a better query that can do this without hanging.
First, let's go for low-hanging fruit and see if avoiding the UNWIND can make a difference.
PROFILE or EXPLAIN the query and see if any numbers look significantly different compared to the original query.
MATCH (start:PERSON)
WHERE start.ID = '385'
MATCH path = (start)-[r:FRIENDSHIP_REL*2..5]->(person:PERSON)
WITH path, REDUCE(countries = [], person IN NODES(path) |
CASE WHEN person.COUNTRY_ID in countries
THEN countries
ELSE countries + person.COUNTRY_ID END) as distinctCountries
WHERE LENGTH(path) + 1 = SIZE(distinctCountries)
RETURN path
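To make the intent of the filter concrete, here is a plain-Python sketch of what the REDUCE expression and the final WHERE clause compute for a single path. It runs no Cypher. The function names and the representation of a path as a list of dicts are assumptions made for illustration.

```python
def distinct_in_order(values):
    """Mirror of the REDUCE: append a value only if it was not collected yet."""
    seen = []
    for value in values:
        if value not in seen:
            seen.append(value)
    return seen


def all_countries_distinct(path_nodes):
    """Return True when no COUNTRY_ID repeats along the path.

    path_nodes is a list of dicts, one per node on the path. A path with N
    relationships has N + 1 nodes, so comparing the node count with the number
    of distinct countries corresponds to
    LENGTH(path) + 1 = SIZE(distinctCountries) in the query.
    """
    countries = [node["COUNTRY_ID"] for node in path_nodes]
    return len(path_nodes) == len(distinct_in_order(countries))
```

For example, a path through countries US, DE, FR passes the filter, while US, DE, US is rejected because US appears twice.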
Source: https://stackoverflow.com/questions/42148577/load-neo4j-in-memory-on-demand-for-heavy-computations