Java Regular Expression Matcher doesn't find all possible matches

遥遥无期 2021-01-21 06:53

I was looking at some code at TutorialsPoint and something has been bothering me ever since. Take a look at this code:

import java.util.regex.Matcher;
import jav         


        
2 Answers
  •  北海茫月
    2021-01-21 07:30

    It's because * is greedy, so the regex engine has to backtrack to find a match.

    String :

    This order was placed for QT3000! OK?
    

    Regex:

    (.*)(\\d+)(.*)
    

    .* is greedy: it matches as many characters as possible. So the first .* initially consumes the whole string, up to and including the final ?, and then backtracks one character at a time so that the rest of the pattern can match. The next part of the regex is \d+, so the engine backtracks until it reaches a digit. As soon as one digit is available, \d+ is satisfied (it needs one or more digits) and backtracking stops. As a result, the first (.*) captures This order was placed for QT300 and the following (\\d+) captures only the last digit, the 0 just before the ! symbol.

    The final (.*) then captures all the remaining characters, that is, ! OK?. m.group(1) returns the text captured by group 1, m.group(2) the text captured by group 2, and so on.
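
    The backtracking described above can be checked with a small program. This is only a sketch: the class name GreedyGroupsDemo and the helper captureGroups are made up for illustration and do not come from the question's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and helper are assumptions, not part of the original code.
class GreedyGroupsDemo {

    // Returns the text captured by each group (1..n) of the first match,
    // or an empty list if the pattern does not match at all.
    static List<String> captureGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        List<String> groups = captureGroups("(.*)(\\d+)(.*)", line);
        // Prints:
        // group 1: [This order was placed for QT300]
        // group 2: [0]
        // group 3: [! OK?]
        for (int i = 0; i < groups.size(); i++) {
            System.out.println("group " + (i + 1) + ": [" + groups.get(i) + "]");
        }
    }
}
```

    The brackets in the output make it easy to see that the greedy first group kept three of the four digits and that the last group starts at the ! symbol.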


    To get your desired output:

    String line = "This order was placed for QT3000! OK?";
    String pattern = "(.*)(\\d{2})(.*)";

    // Create a Pattern object
    Pattern r = Pattern.compile(pattern);

    // Now create a Matcher object
    Matcher m = r.matcher(line);
    while (m.find()) {
        System.out.println("Found value: " + m.group(1));
        System.out.println("Found value: " + m.group(2));
        System.out.println("Found value: " + m.group(3));
    }
    

    Output:

    Found value: This order was placed for QT30
    Found value: 00
    Found value: ! OK?
    

    With (.*)(\\d{2}), the first group backtracks only until two digits are available, so it gives up just the final 00.

    Alternatively, change your pattern to this:

    String pattern = "(.*?)(\\d+)(.*)";
    

    This gives the following output:

    Found value: This order was placed for QT
    Found value: 3000
    Found value: ! OK?
    

    The ? after the * makes it non-greedy (lazy): it matches as few characters as possible, which leaves all four digits for \\d+.
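
    To see the greedy and lazy quantifiers side by side, a comparison along these lines may help. It is a minimal sketch: the class LazyVersusGreedyDemo and the method digitsGroup are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are assumptions made for illustration only.
class LazyVersusGreedyDemo {

    // Returns what group 2 (the digit group) captures in the first match,
    // or null if the pattern does not match the input.
    static String digitsGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";

        // Greedy first group: .* keeps as much as it can, leaving one digit.
        System.out.println(digitsGroup("(.*)(\\d+)(.*)", line));   // prints 0

        // Lazy first group: .*? stops at the first digit, leaving all four.
        System.out.println(digitsGroup("(.*?)(\\d+)(.*)", line));  // prints 3000
    }
}
```

    Only the first group changes between the two patterns. \\d+ itself is greedy in both, but it can only take what the first group leaves behind.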

    Use extra capturing groups to get both outputs from a single program.

    String line = "This order was placed for QT3000! OK?";
    String pattern = "((.*?)(\\d{2}))(?:(\\d{2})(.*))";
    Pattern r = Pattern.compile(pattern);
    Matcher m = r.matcher(line);
    while (m.find()) {
        System.out.println("Found value: " + m.group(1));
        System.out.println("Found value: " + m.group(4));
        System.out.println("Found value: " + m.group(5));
        System.out.println("Found value: " + m.group(2));
        System.out.println("Found value: " + m.group(3) + m.group(4));
        System.out.println("Found value: " + m.group(5));
    }
    

    Output:

    Found value: This order was placed for QT30
    Found value: 00
    Found value: ! OK?
    Found value: This order was placed for QT
    Found value: 3000
    Found value: ! OK?
    
