Lucene 4.2 StringField

面向向阳花 2021-01-21 02:45

I'm new to Lucene. I have two documents and I would like to have an exact match for the document field called "keyword" (the field may occur multiple times within a documen

1 answer
  •  小蘑菇 (original poster)
    2021-01-21 03:38

    Your problem is not in how you are indexing the field. A StringField is the correct way to index the entire input as a single token. The problem is how you are searching. I don't know what you intend to accomplish with this logic:

    BooleanQuery qry = new BooleanQuery();
    qry.add(new TermQuery(new Term("keyword", "\"Annotation is cool\"")), BooleanClause.Occur.MUST);
    //Great! You have a termQuery added to the parent BooleanQuery which should find your keyword just fine!
    
    Query q = new QueryParser(Version.LUCENE_42, "title", analyzer).parse(qry.toString());
    //Now all bets are off.
    

    Query.toString() is handy for debugging, but it is not safe to assume that running its output through a QueryParser will regenerate the same query. The standard query parser has very little ability to express multiple words as a single term. The string form of this query will, I believe, look like:

    keyword:"Annotation is cool"
    

    The parser will interpret this as a PhraseQuery. A PhraseQuery looks for three consecutive terms: Annotation, is, and cool. But the way you have indexed this field, the index holds a single term, "Annotation is cool", so the phrase never matches.
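
    The mismatch can be shown with a toy model in plain Java. This is a sketch and does not use Lucene: the class TermVsPhraseSketch and all of its methods are hypothetical names made up for illustration, and tokenize() is only a crude stand-in for an analyzer. It stores the field value as one untokenized term, then compares an exact term lookup with a phrase lookup that first splits the query into words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only: NOT Lucene, just an illustration of term vs. phrase matching. */
public class TermVsPhraseSketch {

    /** Store the whole value as one term, the way an untokenized field would. */
    public static List<String> indexAsSingleTerm(String value) {
        List<String> terms = new ArrayList<>();
        terms.add(value);
        return terms;
    }

    /** Lowercase and split on whitespace: a crude stand-in for an analyzer. */
    public static List<String> tokenize(String value) {
        return Arrays.asList(value.toLowerCase().split("\\s+"));
    }

    /** Term lookup: matches only if the exact term is present in the index. */
    public static boolean termMatches(List<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    /** Phrase lookup: the tokens of the phrase must appear consecutively in the index. */
    public static boolean phraseMatches(List<String> indexedTerms, String phrase) {
        return Collections.indexOfSubList(indexedTerms, tokenize(phrase)) >= 0;
    }

    public static void main(String[] args) {
        List<String> index = indexAsSingleTerm("Annotation is cool");
        System.out.println(termMatches(index, "Annotation is cool"));   // prints true
        System.out.println(phraseMatches(index, "Annotation is cool")); // prints false
    }
}
```

    The term lookup succeeds because the index holds exactly one term equal to the whole string. The phrase lookup fails because none of the three separate tokens exists in the index.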

    The solution: never use logic like this:

     Query nuttyQuery = queryParser.parse(perfectlyGoodQuery.toString());
     searcher.search(nuttyQuery);
    

    Instead, just search with the BooleanQuery you already created.

     searcher.search(perfectlyGoodQuery);
    
