C# Finding relevant document snippets for search result display

野性不改 2021-02-04 12:07

In developing search for a site I am building, I decided to go the cheap and quick way and use Microsoft SQL Server's Full Text Search engine instead of something more robust l

8 answers
  •  独厮守ぢ
    2021-02-04 12:55

    I took another approach, perhaps it will help someone...

    First it checks whether the word appears in the text at all, in my case ignoring case (change the comparison to suit your needs). Then it uses a Regex to split the text into word tokens and looks for the first token that contains the word (partial, case-insensitive matches are allowed). Starting from that token's index, it takes roughly 10 tokens before and after the word, and those make up the snippet.

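    // Requires: using System; using System.Text.RegularExpressions;
    public static string GetSnippet(string text, string word)
    {
        // Bail out early if the word does not occur anywhere in the text.
        if (text.IndexOf(word, StringComparison.InvariantCultureIgnoreCase) == -1)
        {
            return "";
        }
    
        // Tokenize: each match is a run of non-whitespace starting at a word
        // boundary, plus at most one trailing whitespace character.
        var matches = new Regex(@"\b(\S+)\s?", RegexOptions.Singleline | RegexOptions.Compiled).Matches(text);
    
        // Find the first token that contains the word (partial, case-insensitive).
        var p = -1;
        for (var i = 0; i < matches.Count; i++)
        {
            if (matches[i].Value.IndexOf(word, StringComparison.InvariantCultureIgnoreCase) != -1)
            {
                p = i;
                break;
            }
        }
    
        if (p == -1) return "";

        // Take up to 10 tokens before the hit and up to 9 after it. The upper
        // bound is clamped to matches.Count so that a hit near the end of the
        // text does not throw ArgumentOutOfRangeException.
        var snippet = "";
        var end = Math.Min(p + 10, matches.Count);
        for (var x = Math.Max(p - 10, 0); x < end; x++)
        {
            snippet += matches[x].Value + " ";
        }
        return snippet;
    }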
    
