Lucene query fails with mixed MUST/MUST_NOT

轻奢々 2020-12-03 16:30

Given a document with this text, indexed in a field named Content:

The dish ran away with the spoon.

The following query fails to match tha

1 answer
  • 2020-12-03 16:54

    Unlike a SQL database, Lucene doesn't start with a full view of everything. It starts with no documents matched, and finds documents based on the clauses searched on. This is why:

    -Content:xyz
    

    on its own doesn't really work. Lucene knows to exclude documents matching Content:xyz, but it hasn't been given any documents to match in the first place. The same is true of your query, because the negative clause is placed in a subquery.

    The subquery -Content:xyz is evaluated on its own and matches no documents, so you effectively have:

    +Content:dish +(no documents)
    

    It's useful to think of - as an AND NOT rather than simply a NOT (though don't take that to imply the +/- and AND/OR/NOT syntax necessarily map to each other directly).
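The reasoning above can be sketched with a small toy model in plain Java. This is not Lucene code and does not reflect how Lucene is implemented. It only mimics the rule described in this answer: a boolean query starts from an empty result, positive clauses bring documents in, and negative clauses can only take documents away. All names (`ToyBooleanQuery`, `term`, `matchAll`, `bool`) are hypothetical, document 2 is made up for illustration, and the nested query shape is the one this answer implies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of the matching rule described above. NOT Lucene's implementation;
// every name in this class is hypothetical.
final class ToyBooleanQuery {
    interface Query {
        Set<Integer> match(Map<Integer, String> docs);
    }

    // Matches documents containing the word (lowercased, split on non-word characters).
    static Query term(String word) {
        return docs -> {
            Set<Integer> hits = new TreeSet<>();
            for (Map.Entry<Integer, String> e : docs.entrySet()) {
                String[] tokens = e.getValue().toLowerCase().split("\\W+");
                if (Arrays.asList(tokens).contains(word.toLowerCase())) {
                    hits.add(e.getKey());
                }
            }
            return hits;
        };
    }

    // Stand-in for a match-all query: brings in every document.
    static Query matchAll() {
        return docs -> new TreeSet<>(docs.keySet());
    }

    // must clauses are intersected; mustNot clauses are then subtracted.
    static Query bool(List<Query> must, List<Query> mustNot) {
        return docs -> {
            Set<Integer> result = new TreeSet<>();
            if (must.isEmpty()) {
                return result; // no positive clause: nothing to subtract from
            }
            result.addAll(must.get(0).match(docs));
            for (Query q : must.subList(1, must.size())) {
                result.retainAll(q.match(docs));
            }
            for (Query q : mustNot) {
                result.removeAll(q.match(docs));
            }
            return result;
        };
    }

    static Map<Integer, String> sampleDocs() {
        return Map.of(
            1, "The dish ran away with the spoon.",
            2, "The xyz dish"); // made-up second document
    }

    // -Content:xyz
    static Query loneNegative() {
        return bool(List.of(), List.of(term("xyz")));
    }

    // +Content:dish +(-Content:xyz)
    static Query nested() {
        return bool(List.of(term("dish"), loneNegative()), List.of());
    }

    // +Content:dish -Content:xyz
    static Query flat() {
        return bool(List.of(term("dish")), List.of(term("xyz")));
    }

    // +Content:dish +(<all docs> -Content:xyz)
    static Query nestedWithMatchAll() {
        Query sub = bool(List.of(matchAll()), List.of(term("xyz")));
        return bool(List.of(term("dish"), sub), List.of());
    }

    public static void main(String[] args) {
        Map<Integer, String> docs = sampleDocs();
        System.out.println(loneNegative().match(docs));       // []
        System.out.println(nested().match(docs));             // []
        System.out.println(flat().match(docs));               // [1]
        System.out.println(nestedWithMatchAll().match(docs)); // [1]
    }
}
```

In this model the nested form returns nothing because the inner query has no positive clause to select candidates, while the flat form and the match-all form both return document 1.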

    If you want to be able to execute a lonely negative query like that, you need to bring in all documents first. The MatchAllDocsQuery is the best way to accomplish that, something like:

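    // Older mutable BooleanQuery API; newer Lucene releases build this with BooleanQuery.Builder.
    BooleanQuery query = new BooleanQuery();
    // Bring in every document first, so the negative clause has something to filter.
    query.add(new BooleanClause(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD));
    query.add(new BooleanClause(new TermQuery(new Term("Content","xyz")), BooleanClause.Occur.MUST_NOT));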
    

    That would be the equivalent of a SQL-style query whose WHERE clause contains only a negation.

    Of course, this isn't really necessary in the case you've listed since:

    +Content:dish -Content:xyz
    

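    is perfectly adequate: the negative clause sits in the same query as a positive clause that supplies the documents to filter.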
