Load Neo4J in memory on demand for heavy computations

Submitted by 烈酒焚心 on 2019-12-11 04:48:41

Question


How could I load Neo4j into memory on demand?

At different stages of my long-running jobs I persist nodes and relationships to Neo4j. So Neo4j should stay on disk, since it may consume too much memory and I don't know when I'm going to run read queries against it.

But at some point (only once) I will want to run a pretty heavy read query against my Neo4j server, and it has very poor performance (hours). As a solution, I want to load all of Neo4j into RAM for better performance.

What is the best option for this? Should I use a RAM disk, or are there any better solutions?

P.S.

A query with [r:LINK_REL_1*2] runs pretty fast, [r:LINK_REL_1*3] takes 17 seconds, and [r:LINK_REL_1*4] takes more than 5 minutes; I don't even know how long, since I have a 5-minute timeout. But I need the [r:LINK_REL_1*2..4] query to run in a reasonable time.

My heavy query explanation

PROFILE
MATCH path = (start:COLUMN)-[r:LINK_REL_1*2]->(col:COLUMN) 
WHERE start.ENTITY_ID = '385' 
WITH path UNWIND NODES(path) AS col
WITH path, 
COLLECT(DISTINCT col.DATABASE_ID) as distinctDBs
WHERE LENGTH(path) + 1 = SIZE(distinctDBs)
RETURN path

Updated query with explanation (got the same performance in tests)

PROFILE
MATCH (start:COLUMN)
WHERE start.ENTITY_ID = '385' 
MATCH path = (start)-[r:LINK_REL_1*2]->(col:COLUMN)
WITH path, REDUCE(dbs = [], col IN NODES(path) | 
  CASE WHEN col.DATABASE_ID in dbs 
       THEN dbs 
       ELSE dbs + col.DATABASE_ID END) as distinctDbs
WHERE LENGTH(path) + 1 = SIZE(distinctDbs)
RETURN path


Answer 1:


APOC Procedures has apoc.warmup.run(), which may load much of Neo4j into cached memory. See if that makes a difference.
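
As a minimal sketch of what that call could look like, assuming the APOC plugin is installed and that the installed APOC version still ships the warmup procedure (this depends on the Neo4j/APOC version):

```cypher
// Sketch only: requires the APOC plugin; whether apoc.warmup.run is
// available depends on the installed APOC/Neo4j version.
// The procedure walks the node and relationship stores so that their
// pages end up in Neo4j's page cache before the heavy query runs.
CALL apoc.warmup.run();
```

Warming up only helps if the page cache is large enough to hold the store files. In Neo4j 3.x that size is controlled by the dbms.memory.pagecache.size setting in neo4j.conf, so it is probably worth checking that value against the on-disk size of the database first.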




Answer 2:


It looks like you're trying to create a query in which the path contains only :Persons from distinct countries. Is this right?

If so, I think we can find a better query that can do this without hanging.

First, let's go for low-hanging fruit and see if avoiding the UNWIND can make a difference.

PROFILE or EXPLAIN the query and see if any numbers look significantly different compared to the original query.

MATCH (start:PERSON)
WHERE start.ID = '385'
MATCH path = (start)-[r:FRIENDSHIP_REL*2..5]->(person:PERSON)
WITH path, REDUCE(countries = [], person IN NODES(path) |
  CASE WHEN person.COUNTRY_ID IN countries
       THEN countries
       ELSE countries + person.COUNTRY_ID END) AS distinctCountries
WHERE LENGTH(path) + 1 = SIZE(distinctCountries)
RETURN path
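
The query above uses :PERSON and country names, while the question uses :COLUMN and DATABASE_ID. Mapped back onto the question's schema, it might look like the sketch below. The mapping (PERSON to COLUMN, ID to ENTITY_ID, FRIENDSHIP_REL to LINK_REL_1, COUNTRY_ID to DATABASE_ID) is an assumption, and the range is set to the 2..4 hops the question asks for:

```cypher
// Assumed mapping onto the question's schema:
//   PERSON -> COLUMN, ID -> ENTITY_ID,
//   FRIENDSHIP_REL -> LINK_REL_1, COUNTRY_ID -> DATABASE_ID
PROFILE
MATCH (start:COLUMN)
WHERE start.ENTITY_ID = '385'
MATCH path = (start)-[:LINK_REL_1*2..4]->(:COLUMN)
// Build the list of distinct DATABASE_ID values seen along the path
WITH path, REDUCE(dbs = [], n IN NODES(path) |
  CASE WHEN n.DATABASE_ID IN dbs
       THEN dbs
       ELSE dbs + n.DATABASE_ID END) AS distinctDbs
// Keep only paths in which every node has a different DATABASE_ID
WHERE LENGTH(path) + 1 = SIZE(distinctDbs)
RETURN path
```

Since LENGTH(path) counts relationships, a path has LENGTH(path) + 1 nodes, so the final filter keeps a path only when no DATABASE_ID repeats along it.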


Source: https://stackoverflow.com/questions/42148577/load-neo4j-in-memory-on-demand-for-heavy-computations
