How to detect duplicate words from a String in Java?

醉梦人生 2020-12-18 16:51

What are the ways to detect duplicate words in a String?

e.g. "this is a test message for duplicate test" contains one duplicate word: test.

2 Answers
  • 2020-12-18 17:34

    The following Java code detects duplicate words in a String. It still works when the repeated words are separated by newlines or punctuation.

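        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        public class DuplicateWords {
            public static void main(String[] args) {
                // (?i): ignore case. Group 1 captures a word; \1 requires the same word later.
                // [\w\W]* matches any characters in between, including newlines and punctuation.
                String duplicatePattern = "(?i)\\b(\\w+)\\b[\\w\\W]*\\b\\1\\b";
                Pattern p = Pattern.compile(duplicatePattern);
                String phrase = "this is#$;%@;<>?|\\` p is a is Test\n of duplicate test";
                Matcher m = p.matcher(phrase);
                while (m.find()) {
                    // group() is the whole matched segment; group(1) is the repeated word.
                    System.out.println("Matching segment is \"" + m.group() + "\"");
                    System.out.println("Duplicate word: " + m.group(1) + "\n");
                }
            }
        }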
    

    The output of the code will be:

    Matching segment is "is#$;%@;<>?|\` p is a is"
    Duplicate word: is
    
    Matching segment is "Test
     of duplicate test"
    Duplicate word: Test
    

    Here, m.group(1) returns the text captured by the first group of the pattern, which is (\\w+).
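
    One limitation worth noting: because [\w\W]* is greedy, each match swallows everything between the first and last occurrence of a word, so a second repeated word inside that span is not reported (in "a b a b" only a is found). A possible variant, not part of the code above, moves the tail into a lookahead so that every word which recurs later is reported. This is only a sketch; the class name LookaheadDuplicates and the method find are made up for illustration.

    ```java
    import java.util.LinkedHashSet;
    import java.util.Locale;
    import java.util.Set;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class LookaheadDuplicates {
        // A whole word that is followed, somewhere later in the input,
        // by the same whole word (compared case-insensitively).
        private static final Pattern REPEATED =
                Pattern.compile("(?i)\\b(\\w+)\\b(?=[\\w\\W]*\\b\\1\\b)");

        // Hypothetical helper: returns each repeated word once, lowercased,
        // in order of first appearance.
        static Set<String> find(String text) {
            Set<String> result = new LinkedHashSet<>();
            Matcher m = REPEATED.matcher(text);
            while (m.find()) {
                // The lookahead consumes nothing, so scanning resumes right after this word.
                result.add(m.group(1).toLowerCase(Locale.ROOT));
            }
            return result;
        }

        public static void main(String[] args) {
            System.out.println(find("a b a b")); // prints [a, b]
        }
    }
    ```

    The lookahead still rescans the rest of the input for every word, so this does not change the cost of the regex approach; it only changes which duplicates get reported.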

  • 2020-12-18 17:57

    The best you can do with regexes is O(N^2) search complexity. You can easily achieve O(N) time and O(N) space by splitting the input into words and using a HashSet to detect duplicates.
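
    A minimal sketch of that idea follows. The answer gives no code, so the details are assumptions: a "word" is taken to be a run of \w characters (as in the regex answer), comparison is case-insensitive, and the names DuplicateWordFinder and findDuplicates are made up for illustration.

    ```java
    import java.util.HashSet;
    import java.util.LinkedHashSet;
    import java.util.Locale;
    import java.util.Set;

    class DuplicateWordFinder {
        // Returns every word that occurs more than once, lowercased,
        // in the order in which its first repeat is seen.
        static Set<String> findDuplicates(String text) {
            Set<String> seen = new HashSet<>();
            Set<String> duplicates = new LinkedHashSet<>();
            // Assumption: words are separated by runs of non-word characters.
            for (String word : text.split("\\W+")) {
                if (word.isEmpty()) {
                    continue; // split yields an empty first token when text starts with a separator
                }
                String key = word.toLowerCase(Locale.ROOT);
                // add() returns false when the set already contains the key.
                if (!seen.add(key)) {
                    duplicates.add(key);
                }
            }
            return duplicates;
        }

        public static void main(String[] args) {
            // prints [test]
            System.out.println(findDuplicates("this is a test message for duplicate test"));
        }
    }
    ```

    Each word is hashed and looked up once, so the expected running time is linear in the input length, at the cost of storing every distinct word.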
