Highlighting whole sentence in Lucene.net 2.9.2

Posted by 一世执手 on 2019-12-20 03:49:11

Question


I'm currently working with Lucene.net 2.9.2. I would like my search results page (ASP.NET) to show a highlighted text fragment, and I would like that fragment to be a whole sentence rather than just a few words.

For example, if I have the text:

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

and I search for cupidatat, I would like to get the fragment:

Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

The code that I have right now is:

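// q is the parsed query; the formatter wraps each matched term in <div> tags
var scorer = new QueryScorer(q);
var formatter = new SimpleHTMLFormatter("<div>", "</div>");

var highlighter = new Highlighter(formatter, scorer);
// SimpleFragmenter cuts the text into fixed-size fragments of about 100 characters
highlighter.SetTextFragmenter(new SimpleFragmenter(100));

// stream is the token stream for text; request the single best-scoring fragment
var fragments = highlighter.GetBestFragments(stream, text, 1);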

but it returns only a text fragment about 100 characters long.

I would be grateful for any suggestions.


Answer 1:


You want to create a new Fragmenter (similar to SimpleFragmenter). The method you need to adjust is:

public virtual bool IsNewFragment(Token token)
{
    bool isNewFrag = token.EndOffset() >= (fragmentSize * currentNumFrags);
    if (isNewFrag)
    {
        currentNumFrags++;
    }

    return isNewFrag;
}

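As written, this starts a new fragment every fragmentSize characters; you would change the condition so that it fires at sentence boundaries instead. It will likely take some adjustment to get the logic right, but this should give you a good head start.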
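
One possible shape for the sentence logic is sketched below, under some assumptions. SentenceBoundaryTracker is a hypothetical helper, not part of Lucene.net. It treats '.', '!' and '?' as sentence terminators, so it will misfire on abbreviations and decimal numbers. It deliberately avoids Lucene.net types so it can be tried on its own.

```csharp
using System;

// Hypothetical helper (not part of Lucene.net): reports whether a token
// begins a new sentence by inspecting the characters between tokens.
public class SentenceBoundaryTracker
{
    // Assumption: these three characters end a sentence.
    private static readonly char[] Terminators = { '.', '!', '?' };

    private string text = string.Empty;
    private int lastTokenEnd;
    private bool seenToken;

    // Call once per document, before any tokens are checked.
    public void Start(string originalText)
    {
        text = originalText ?? string.Empty;
        lastTokenEnd = 0;
        seenToken = false;
    }

    // Call once per token, in document order, with the token's character
    // offsets. Returns true when a terminator lies between the end of the
    // previous token and the start of this one.
    public bool IsNewSentence(int startOffset, int endOffset)
    {
        // Clamp the scanned range so it always stays inside the text.
        int from = Math.Min(lastTokenEnd, text.Length);
        int to = Math.Min(Math.Max(startOffset, from), text.Length);

        // The first token never starts a new fragment.
        bool crossed = seenToken
            && text.IndexOfAny(Terminators, from, to - from) >= 0;

        seenToken = true;
        lastTokenEnd = Math.Max(lastTokenEnd, endOffset);
        return crossed;
    }
}
```

In a custom Fragmenter, Start(originalText) would forward to the tracker's Start, and IsNewFragment(token) would return IsNewSentence(token.StartOffset(), token.EndOffset()). This assumes the Fragmenter interface in your Lucene.net build exposes those two methods, as the snippet above suggests. The fragmenter would then be passed to highlighter.SetTextFragmenter in place of SimpleFragmenter.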



Source: https://stackoverflow.com/questions/5549589/highlighting-whole-sentence-in-lucene-net-2-9-2
