Neo4j Fast way to match fuzzy text property

太阳男子 2021-02-09 21:51

I have a reasonable number of nodes (roughly 60,000)

(:Document {title: "A title"})

Given a title, I want to find the matching node, if it exi

1 answer
  • 2021-02-09 22:23

    Neo4j 3.5 (currently at beta03) has built-in full-text search (FTS) capabilities.

    EDIT: I have written a detailed blog post about FTS in Neo4j: https://graphaware.com/neo4j/2019/01/11/neo4j-full-text-search-deep-dive.html

    You can then query your documents using the Lucene Classic Query Parser syntax.

    Create the index:

    CALL db.index.fulltext.createNodeIndex('documents', ['Document'], ['title','text'])
    

    Import some documents:

    LOAD CSV WITH HEADERS FROM "file:///docs.csv" AS row
    CREATE (n:Document) SET n = row
    

    Query documents whose title contains "heavy toll":

    CALL db.index.fulltext.queryNodes('documents', 'title: "heavy toll"')
    YIELD node, score
    RETURN node.title, score
    
    ╒══════════════════════════════════════════════════════════════════════╤══════════════════╕
    │"node.title"                                                          │"score"           │
    ╞══════════════════════════════════════════════════════════════════════╪══════════════════╡
    │"Among Deaths in 2016, a Heavy Toll in Pop Music - The New York Times"│3.7325966358184814│
    └──────────────────────────────────────────────────────────────────────┴──────────────────┘
    

    Query for the same title with a typo:

    CALL db.index.fulltext.queryNodes('documents', 'title: \\"heavy~ tall~\\"')
    YIELD node, score
    RETURN node.title, score
    

    Notice the escaping of the quotes (\"): the string passed to the underlying parser must contain the quotes in order to perform a phrase query rather than a boolean query.

    Also, the tilde (~) after each term requests a fuzzy search based on the Damerau-Levenshtein distance.
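To make the fuzzy matching more concrete, here is a minimal Python sketch of the edit distance involved. It is not Lucene's implementation (Lucene uses automata internally); it is the simpler "optimal string alignment" variant of Damerau-Levenshtein, and the function name `osa_distance` is made up for this illustration. As far as I know, Lucene's fuzzy operator accepts terms within at most two edits of the query term by default.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (a restricted Damerau-Levenshtein).

    Insertions, deletions, substitutions and swaps of two adjacent
    characters each cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = distance between the first i chars of a and first j chars of b
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # adjacent transposition, e.g. "ae" <-> "ea"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


print(osa_distance("tall", "toll"))    # 1 (one substitution)
print(osa_distance("tall", "talk"))    # 1 (one substitution)
print(osa_distance("haevy", "heavy"))  # 1 (one transposition)
```

With this measure, the typo "tall" is one edit away from "toll", but also one edit away from "talk", so a fuzzy term can match more titles than the one you were looking for.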

    ╒══════════════════════════════════════════════════════════════════════╤═════════════════════╕
    │"node.title"                                                          │"score"              │
    ╞══════════════════════════════════════════════════════════════════════╪═════════════════════╡
    │"Among Deaths in 2016, a Heavy Toll in Pop Music - The New York Times"│0.868073046207428    │
    ├──────────────────────────────────────────────────────────────────────┼─────────────────────┤
    │"Prisons Run by C.E.O.s? Privatization Under Trump Could Carry a Heavy│0.4014900326728821   │
    │ Price - The New York Times"                                          │                     │
    ├──────────────────────────────────────────────────────────────────────┼─────────────────────┤
    │"‘All Talk,’ ‘No Action,’ Says Trump in Twitter Attack on Civil Rights│0.28181418776512146  │
    │ Icon - The New York Times"                                           │                     │
    ├──────────────────────────────────────────────────────────────────────┼─────────────────────┤
    │"Immigrants Head to Washington to Rally While Obama Is Still There - T│0.24634429812431335  │
    │he New York Times"                                                    │                     │
    ├──────────────────────────────────────────────────────────────────────┼─────────────────────┤
    