Using the hit highlighter in Lucene

醉梦人生 2020-12-06 03:50

I have two questions regarding the hit highlighter provided with Apache Lucene:

  1. See this function: could you explain the use of the TokenStream parameter?

1 Answer
  • 2020-12-06 04:21

    EDIT: added some details about explain().

    Some general background: the Lucene Highlighter finds text snippets in a hit document and highlights the tokens that match the query.

    1. The TokenStream parameter is therefore used to break the hit text into tokens. The highlighter's scorer then scores each token, so that it can score fragments and choose which snippets and tokens to highlight.
    2. I believe you are doing it wrong. If all you want is to know which query terms matched in the document, you should use the explain() method. Basically, after you have instantiated a searcher, use:

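    // Ask the searcher why (and how well) this document matched the query.
    Explanation expl = searcher.explain(query, docId);
    // Plain-text rendering of the explanation tree.
    String asText = expl.toString();
    // HTML rendering; toHtml() belongs to the older Lucene API this answer targets, so check that your version still has it.
    String asHtml = expl.toHtml();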

    docId is the raw document id from the search results.

    Use the Highlighter only if you actually need the snippets and/or highlights. If you still want to use it, follow Nicholas Hrychan's advice. Beware, though: he describes the Lucene 2.4.1 API. If you use a more recent version, use "QueryScorer" where he says "SpanScorer".
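
    To make the tokenize, score, and pick-a-fragment flow described above concrete, here is a toy sketch in plain Java. It does not use Lucene at all: the class and method names (ToyHighlighter, tokenize, score, bestFragment) are hypothetical, whitespace splitting stands in for a real TokenStream, and a simple term-set lookup stands in for a scorer such as QueryScorer. The real Highlighter differs in many details, so treat this only as an illustration of the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy illustration only; NOT Lucene's implementation. All names are hypothetical.
public class ToyHighlighter {

    /** Splits text on whitespace; a crude stand-in for what a TokenStream provides. */
    public static List<String> tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    /** Lowercases a token and strips leading/trailing punctuation before comparison. */
    public static String normalize(String token) {
        return token.toLowerCase(Locale.ROOT).replaceAll("^\\W+|\\W+$", "");
    }

    /** Scores one token: 1 if it matches a query term, 0 otherwise. */
    public static int score(String token, Set<String> queryTerms) {
        return queryTerms.contains(normalize(token)) ? 1 : 0;
    }

    /**
     * Cuts the text into consecutive fragments of fragmentSize tokens, scores each
     * fragment as the sum of its token scores, and returns the best fragment with
     * matching tokens wrapped in <b>...</b>. Returns null when no token matches.
     */
    public static String bestFragment(String text, Set<String> queryTerms, int fragmentSize) {
        List<String> tokens = tokenize(text);
        int bestStart = -1;
        int bestScore = 0;
        for (int start = 0; start < tokens.size(); start += fragmentSize) {
            int end = Math.min(start + fragmentSize, tokens.size());
            int fragmentScore = 0;
            for (int i = start; i < end; i++) {
                fragmentScore += score(tokens.get(i), queryTerms);
            }
            if (fragmentScore > bestScore) {
                bestScore = fragmentScore;
                bestStart = start;
            }
        }
        if (bestStart < 0) {
            return null;
        }
        int end = Math.min(bestStart + fragmentSize, tokens.size());
        StringBuilder out = new StringBuilder();
        for (int i = bestStart; i < end; i++) {
            if (i > bestStart) {
                out.append(' ');
            }
            String token = tokens.get(i);
            out.append(score(token, queryTerms) > 0 ? "<b>" + token + "</b>" : token);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>(Arrays.asList("lucene", "highlighter"));
        String text = "Apache projects are many. The Lucene highlighter marks matching tokens in a hit.";
        // prints: The <b>Lucene</b> <b>highlighter</b> marks
        System.out.println(bestFragment(text, terms, 4));
    }
}
```

    The point of the sketch is that highlighting needs the hit text re-tokenized, which is why the real API asks for a TokenStream, whereas explain() only needs the query and the document id.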
