I was looking at some code on TutorialsPoint, and something has been bothering me ever since. Take a look at this code:
import java.util.regex.Matcher;
import jav
(.*?)(\\d+)(.*)
Make your greedy * quantifier non-greedy by writing *? instead.
Because your first group (.*) is greedy, it captures everything it can and leaves just a single 0 for \d+ to capture. If you make it non-greedy, you get the expected result. See the demo:
https://regex101.com/r/tX2bH4/53
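The difference can also be checked locally. This is only a minimal sketch, assuming the input string from the question; the class name GreedyVsLazy and the helper groups are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Hypothetical helper: returns groups 1..n of the first match,
    // or an empty array if the regex does not match at all.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            out[i - 1] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // Greedy first group: only the last digit is left for \d+
        System.out.println(String.join(" | ", groups("(.*)(\\d+)(.*)", line)));
        // Lazy first group: stops as soon as \d+ can start matching
        System.out.println(String.join(" | ", groups("(.*?)(\\d+)(.*)", line)));
    }
}
```

The first line printed should show QT300 and 0 as separate groups, the second QT and 3000.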
It's because of the greediness of * and the backtracking that follows from it.
String:
This order was placed for QT3000! OK?
Regex:
(.*)(\\d+)(.*)
We all know that .* is greedy and matches as many characters as possible. So the first .* initially matches everything up to and including the last character, ?, and then backtracks in order to allow a match. The next pattern in our regex is \d+, so it backtracks until a digit is available. Once it has given back a digit, \d+ matches that digit, because its condition is satisfied (\d+ matches one or more digits). So the first (.*) captures This order was placed for QT300 and the following (\\d+) captures the digit 0 located just before the ! symbol.
The last pattern, (.*), then captures all the remaining characters, that is !<space>OK?. m.group(1) refers to the characters captured by group index 1, m.group(2) refers to index 2, and so on.
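To see where the backtracking actually stopped, you can print the start and end offsets of each group. A rough sketch, assuming the same string and regex as above; GroupOffsets and describe are names I made up, while start(int), end(int) and group(int) are standard java.util.regex.Matcher methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupOffsets {
    // Hypothetical helper: formats group n of the current match as "start-end:text".
    // The end offset is exclusive, as with String.substring.
    static String describe(Matcher m, int n) {
        return m.start(n) + "-" + m.end(n) + ":" + m.group(n);
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        Matcher m = Pattern.compile("(.*)(\\d+)(.*)").matcher(line);
        if (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                System.out.println("group " + i + " -> " + describe(m, i));
            }
        }
    }
}
```

Group 1 ends exactly where group 2 starts, one character before the !, which shows that the greedy .* gave back only as much as \d+ needed.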
See the demo here.
To get your desired output:
String line = "This order was placed for QT3000! OK?";
String pattern = "(.*)(\\d{2})(.*)";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
while (m.find()) {
    System.out.println("Found value: " + m.group(1));
    System.out.println("Found value: " + m.group(2));
    System.out.println("Found value: " + m.group(3));
}
Output:
Found value: This order was placed for QT30
Found value: 00
Found value: ! OK?
In (.*)(\\d{2}), the .* backtracks until two digits are left for \\d{2} to match.
Or change your pattern to this:
String pattern = "(.*?)(\\d+)(.*)";
to get output like this:
Found value: This order was placed for QT
Found value: 3000
Found value: ! OK?
The ? after the * forces the * to do a non-greedy match.
Use extra capturing groups to get both outputs from a single program.
String line = "This order was placed for QT3000! OK?";
String pattern = "((.*?)(\\d{2}))(?:(\\d{2})(.*))";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
while (m.find()) {
    System.out.println("Found value: " + m.group(1));
    System.out.println("Found value: " + m.group(4));
    System.out.println("Found value: " + m.group(5));
    System.out.println("Found value: " + m.group(2));
    System.out.println("Found value: " + m.group(3) + m.group(4));
    System.out.println("Found value: " + m.group(5));
}
Output:
Found value: This order was placed for QT30
Found value: 00
Found value: ! OK?
Found value: This order was placed for QT
Found value: 3000
Found value: ! OK?
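If the group indexes in that pattern look confusing: capturing groups are numbered by the position of their opening parenthesis, and (?: ... ) does not get a number. A small sketch that just lists every group, assuming the same string and pattern; GroupNumbering and listGroups are hypothetical names used only here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumbering {
    // Hypothetical helper: lists every capturing group of the first match
    // as "index=text", one per line. Returns an empty string if nothing matches.
    static String listGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder sb = new StringBuilder();
        if (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                if (i > 1) {
                    sb.append('\n');
                }
                sb.append(i).append('=').append(m.group(i));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // Group 1 wraps groups 2 and 3; the (?: ... ) wrapper around 4 and 5 is not counted.
        System.out.println(listGroups("((.*?)(\\d{2}))(?:(\\d{2})(.*))", line));
    }
}
```

Group 1 is the outer group wrapping groups 2 and 3, which is why the program above can print both variants from one match.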