Exclude Similarities in List of Strings to extract the Difference

無奈伤痛 2021-01-26 11:14

I have a list of sentences that are identical except for the title of the book.

How can I loop through the list and exclude the similarities to find the title of each book?

2 answers
  • 2021-01-26 11:52

    This was an interesting problem, so I've played around with it a little and came up with the following (cumbersome) solution:

    Find the first index where any of the sentences have a different char, then do the same in the reversed sentences, and then use Substring to extract only the different parts of the sentences:

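    // Requires: using System; using System.Collections.Generic; using System.Linq;
    List<string> ExtractDifferences(List<string> sentences)
    {
        // Length of the prefix shared by all sentences.
        var firstDiffIndex = GetFirstDifferenceIndex(sentences);
        // Length of the shared suffix, found as the shared prefix of the reversed sentences.
        var lastDiffIndex = GetFirstDifferenceIndex(sentences.Select(s => new string(s.Reverse().ToArray())).ToList());
        // Clamp to zero so a sentence shorter than prefix + suffix yields "" instead of throwing.
        return sentences.Select(s => s.Substring(firstDiffIndex, Math.Max(0, s.Length - lastDiffIndex - firstDiffIndex))).ToList();
    }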
    
    
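    int GetFirstDifferenceIndex(IList<string> strings)
    {
        int firstDifferenceIndex = int.MaxValue;

        for (int i = 0; i < strings.Count; i++)
        {
            var current = strings[i];
            // Compare each string with its predecessor (the first one wraps around to the last).
            var prev = strings[i == 0 ? strings.Count - 1 : i - 1];

            // Stop at the shorter length so prev[index] never goes out of range.
            // If no character differs, the index ends up equal to that length.
            var limit = Math.Min(current.Length, prev.Length);
            var firstDiffIndex = 0;
            while (firstDiffIndex < limit && current[firstDiffIndex] == prev[firstDiffIndex])
            {
                firstDiffIndex++;
            }

            if (firstDiffIndex < firstDifferenceIndex)
            {
                firstDifferenceIndex = firstDiffIndex;
            }
        }
        return firstDifferenceIndex;
    }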
    

    The GetFirstDifferenceIndex method could probably be written more cleanly, perhaps with LINQ, but I don't have enough time to play with it.

    You can see a live demo on rextester.
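
    As a rough sketch of the LINQ rewrite hinted at above (not the answer author's code): the hypothetical helper CommonPrefixLength counts the leading positions at which every string agrees, bounded by the shortest string, and ExtractDifferences reuses it on the reversed sentences to measure the shared suffix. The class and method names are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DifferenceExtractor
{
    // Hypothetical helper: number of leading characters shared by every string.
    public static int CommonPrefixLength(IReadOnlyList<string> strings)
    {
        if (strings.Count == 0) return 0;

        // Never index past the end of the shortest string.
        int shortest = strings.Min(s => s.Length);

        return Enumerable.Range(0, shortest)
            .TakeWhile(i => strings.All(s => s[i] == strings[0][i]))
            .Count();
    }

    public static List<string> ExtractDifferences(IReadOnlyList<string> sentences)
    {
        int prefix = CommonPrefixLength(sentences);

        // Reversing each sentence turns the shared suffix into a shared prefix.
        var reversed = sentences.Select(s => new string(s.Reverse().ToArray())).ToList();
        int suffix = CommonPrefixLength(reversed);

        // Clamp the length so overlapping prefix and suffix give "" rather than an exception.
        return sentences
            .Select(s => s.Substring(prefix, Math.Max(0, s.Length - prefix - suffix)))
            .ToList();
    }
}
```

    Note that this approach, like the one above, strips anything all the inputs share: if every title in the list happened to start or end with the same characters, those characters would be removed from the results as well.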

  • 2021-01-26 11:59

    A working solution using LINQ:

    // Requires: using System; using System.Collections.Generic; using System.Linq;
    const string prefix = "The book named ";
    const string suffix = " is a classic."; // includes the trailing period

    List<string> sentences = new List<string>
    {
        "The book named Lord of the Flies is a classic.",
        "The book named To Kill a Mockingbird is a classic.",
        "The book named The Catcher in the Rye is a classic.",
        "Hello",
        "The book named "
    };

    // Skip sentences too short to hold prefix and suffix, group by the first 15 characters,
    // and strip the suffix and then the prefix from each element to leave only the title.
    List<string> titles = sentences
        .Where(sentence => sentence.Length > prefix.Length + suffix.Length)
        .GroupBy(sentence => sentence.Substring(0, prefix.Length),
                 sentence => sentence.Remove(sentence.Length - suffix.Length).Substring(prefix.Length))
        .Where(g => g.Key == prefix)
        .SelectMany(g => g)
        .ToList();

    foreach (var title in titles)
        Console.WriteLine(title);
    

    First, it filters out sentences too short to contain both the fixed prefix and the fixed suffix, then groups the rest by their first 15 characters (the length of "The book named ") and extracts the titles with String.Remove and Substring.
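
    When the surrounding text is known in advance, as this answer assumes, the same idea can also be sketched without grouping by checking both ends of each sentence explicitly. TitleExtractor and ExtractTitles are illustrative names, not part of the answer, and the sketch assumes an ordinal (case-sensitive) match is what is wanted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TitleExtractor
{
    // Illustrative helper: returns the text between a known prefix and a known suffix,
    // skipping sentences that do not both start with the prefix and end with the suffix.
    public static List<string> ExtractTitles(IEnumerable<string> sentences, string prefix, string suffix)
    {
        return sentences
            .Where(s => s.Length > prefix.Length + suffix.Length
                     && s.StartsWith(prefix, StringComparison.Ordinal)
                     && s.EndsWith(suffix, StringComparison.Ordinal))
            .Select(s => s.Substring(prefix.Length, s.Length - prefix.Length - suffix.Length))
            .ToList();
    }
}
```

    Unlike the grouping version, this variant also verifies the suffix, so a sentence that starts with "The book named " but ends differently is skipped rather than truncated.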
