Matcher not finding overlapping words?

猫巷女王i 2021-01-28 08:05

I'm trying to take a string:

String s = "This is a String!";

And return all 2-word pairs within that string. Namely:

{"this


        
4 Answers
  • 2021-01-28 08:28

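    I tried using the pattern's capture groups.

    // Requires java.util.regex.Pattern and java.util.regex.Matcher.
    String s = "this is a String";

    // Two words separated by a single space; group 3 is the second word.
    Pattern pat = Pattern.compile("([^ ]+)( )([^ ]+)");
    Matcher mat = pat.matcher(s);
    boolean check = mat.find();
    while (check) {
        System.out.println(mat.group());
        // Restart at the second word so it can open the next pair.
        check = mat.find(mat.start(3));
    }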
    

    From the pattern ([^ ]+)( )([^ ]+):

    group(0)  ([^ ]+)( )([^ ]+)  (the whole match)
    group(1)  ([^ ]+)
    group(2)  ( )
    group(3)  ([^ ]+)
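To make the group numbering above concrete, here is a minimal sketch that prints the text and index range of each group for the first match. The class name `GroupSpans` and its helper methods are hypothetical names chosen for illustration; only `Pattern`, `Matcher`, `group`, `start`, `end`, and `groupCount` come from `java.util.regex`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class GroupSpans {
    // Describe group g of the current match as text[start,end).
    static String span(Matcher m, int g) {
        return m.group(g) + "[" + m.start(g) + "," + m.end(g) + ")";
    }

    // Spans of groups 0..3 for the first match in s, or an empty array if none.
    static String[] firstMatchSpans(String s) {
        Matcher m = Pattern.compile("([^ ]+)( )([^ ]+)").matcher(s);
        if (!m.find()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount() + 1];
        for (int g = 0; g <= m.groupCount(); g++) {
            out[g] = span(m, g);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : firstMatchSpans("this is a String")) {
            System.out.println(line);
        }
    }
}
```

For "this is a String", group 3 starts at index 5, which is why passing `mat.start(3)` to `find` resumes the search at the second word of the pair just matched.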

  • 2021-01-28 08:33

    I like the two answers already posted, counting words and subtracting one, but if you just need a regex to find overlapping matches:

    Pattern pattern = Pattern.compile("\\S+ \\S+");
    Matcher matcher = pattern.matcher(inputString);
    int matchCount = 0;
    boolean found = matcher.find();
    while (found) {
      matchCount += 1;
      // search starting after the last match began
      found = matcher.find(matcher.start() + 1);
    }
    

    In practice, you'll need to be a little more clever than simply adding 1: on "the force", this loop matches "the force", then "he force", and then "e force". Of course, this is overkill for counting words, but the technique may prove useful if the regex is more complicated than that.
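One possible way to be "more clever" is sketched below. It is an alternative technique, not the answer's own method: the pair is captured inside a zero-width lookahead, and a negative lookbehind only lets a match begin at the start of a word. Since each match consumes no characters, `find()` can report pairs that share a word. The class name `OverlappingPairs` and the method `find` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, shown as one possible approach.
class OverlappingPairs {
    // (?<!\S)       : the previous character is not a non-space, so we are at a word start
    // (?=(\S+ \S+)) : look ahead for two words and capture them without consuming input
    private static final Pattern PAIR = Pattern.compile("(?<!\\S)(?=(\\S+ \\S+))");

    static List<String> find(String input) {
        List<String> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            pairs.add(m.group(1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(find("the force awakens"));
    }
}
```

With this pattern, "the force" yields a single pair, and partial words such as "he force" are never reported because the lookbehind rejects any position in the middle of a word.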

  • 2021-01-28 08:35

    Total pair count = Total number of words - 1

    And you already know how to count total number of words.

  • 2021-01-28 08:35

    Run a for loop from i = 0 to (number of words - 2); words i and i+1 together make up one 2-word string.

    String[] splitString = string.split(" ");
    for(int i = 0; i < splitString.length - 1; i++) {
        System.out.println(splitString[i] + " " + splitString[i+1]);
    }
    

    The number of 2-word strings within a sentence is simply the number of words minus one.

    int numOfPairs = string.split(" ").length - 1;
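The split-based approach above assumes words are separated by exactly one space. A slightly more defensive sketch follows, assuming that leading or trailing whitespace and runs of whitespace should be ignored. The class name `SplitPairs` and the method `pairs` are hypothetical, and punctuation is left attached to its word.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, shown as a sketch under the assumptions stated above.
class SplitPairs {
    static List<String> pairs(String sentence) {
        List<String> result = new ArrayList<>();
        String trimmed = sentence.trim();
        // "".split("\\s+") returns one empty string, so handle blank input first.
        if (trimmed.isEmpty()) {
            return result;
        }
        String[] words = trimmed.split("\\s+");
        // Each word except the last one starts a pair with its successor.
        for (int i = 0; i < words.length - 1; i++) {
            result.add(words[i] + " " + words[i + 1]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pairs("  This  is a String! "));
    }
}
```

The size of the returned list is the number of words minus one, or zero when the input has fewer than two words.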
    