I have a reasonable number of nodes (roughly 60,000)
(:Document {title:"A title"})
Given a title, I want to find the matching node, if it exi
In Neo4j 3.5 (currently on beta03), there are FTS (Full-Text Search) capabilities.
EDIT: I have written a detailed blog post about FTS in Neo4j: https://graphaware.com/neo4j/2019/01/11/neo4j-full-text-search-deep-dive.html
You can then query your documents using the Lucene Classic Query Parser syntax.
Create the index:
CALL db.index.fulltext.createNodeIndex('documents', ['Document'], ['title','text'])
Import some documents:
LOAD CSV WITH HEADERS FROM "file:///docs.csv" AS row
CREATE (n:Document) SET n = row
Query documents whose title contains "heavy toll":
CALL db.index.fulltext.queryNodes('documents', 'title: "heavy toll"')
YIELD node, score
RETURN node.title, score
│"node.title" │"score" │
│"Among Deaths in 2016, a Heavy Toll in Pop Music - The New York Times"│3.7325966358184814│
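When the title comes from user input rather than a hard-coded string, the query string has to be assembled safely. The following is a minimal Python sketch, not part of the original answer: the helper names `escape_lucene` and `phrase_query` are hypothetical, and the set of escaped characters is the list of special characters documented for the Lucene classic query parser.

```python
# Characters that carry special meaning in the Lucene classic query parser.
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_lucene(text: str) -> str:
    """Backslash-escape every Lucene special character in `text`."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in text)


def phrase_query(field: str, phrase: str) -> str:
    """Build a field-scoped phrase query such as title:"heavy toll".

    The surrounding double quotes are what make this a phrase query;
    any quote inside the phrase itself is escaped so it cannot end it early.
    """
    return f'{field}:"{escape_lucene(phrase)}"'


if __name__ == "__main__":
    print(phrase_query("title", "heavy toll"))
```

Passing the resulting string as a Cypher parameter, for example `CALL db.index.fulltext.queryNodes('documents', $query)`, should avoid a second layer of quote escaping in the Cypher string literal.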
Query for the same title with a typo:
CALL db.index.fulltext.queryNodes('documents', 'title: \"heavy~ tall~\"')
YIELD node, score
RETURN node.title, score
Notice the escaped quotes (\"): the string passed to the underlying parser must contain the quotes in order to perform a phrase query instead of a boolean query.
Also, the tilde (~) next to each term requests a fuzzy search using the Damerau-Levenshtein algorithm.
│"node.title" │"score" │
│"Among Deaths in 2016, a Heavy Toll in Pop Music - The New York Times"│0.868073046207428 │
│"Prisons Run by C.E.O.s? Privatization Under Trump Could Carry a Heavy Price - The New York Times"│0.4014900326728821 │
│"‘All Talk,’ ‘No Action,’ Says Trump in Twitter Attack on Civil Rights Icon - The New York Times"│0.28181418776512146 │
│"Immigrants Head to Washington to Rally While Obama Is Still There - The New York Times"│0.24634429812431335 │
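To see why the misspelled term "tall" can still match titles containing "Toll" or "Talk", it helps to compute the edit distance by hand. The sketch below is an illustration only, not Lucene's actual implementation (which, as far as I know, uses automata and allows at most two edits). It implements the optimal string alignment variant of Damerau-Levenshtein, and the function name `osa_distance` is hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions and transpositions of
    two adjacent characters, each at a cost of 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


if __name__ == "__main__":
    for candidate in ("toll", "talk"):
        print("tall", candidate, osa_distance("tall", candidate))
```

Both "toll" and "talk" are a single substitution away from "tall", which is consistent with the "Heavy Toll" and "All Talk" titles appearing in the results above, assuming the index analyzer lowercases terms.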