I have a problem using regexp in Java. The example code writes out ABC_012_suffix_suffix, but I was expecting it to output ABC_012_suffix:
Pattern regexp = Pattern.compile(".*");
Matcher matcher = regexp.matcher("ABC_012");
matcher.matches();
System.out.println(matcher.group(0));
System.out.println(matcher.replaceAll("$0_suffix"));
Same happens here, the output is:
ABC_012
ABC_012_suffix_suffix
The reason is hidden in the replaceAll method: it tries to find all subsequences that match the pattern:
while (matcher.find()) {
    System.out.printf("Start: %s, End: %s%n", matcher.start(), matcher.end());
}
This will result in:
Start: 0, End: 7
Start: 7, End: 7
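For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same loop. The class name FindAllMatches and the helper spans are made up for illustration. It deliberately creates a fresh matcher, on the assumption that a matcher already advanced by an earlier successful match call may resume searching from where that match ended rather than from the start.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original post.
class FindAllMatches {

    // Collects "start-end" for every match that find() reports on a fresh matcher.
    static List<String> spans(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.start() + "-" + m.end());
        }
        return result;
    }

    public static void main(String[] args) {
        // The second entry is the empty match at the end of the input.
        System.out.println(spans(".*", "ABC_012")); // prints [0-7, 7-7]
    }
}
```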
So, to our first surprise, the matcher finds two subsequences, "ABC_012" and another "". And it appends "_suffix" to both of them:
"ABC_012" + "_suffix" + "" + "_suffix"
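To make that concatenation concrete, here is a rough sketch that rebuilds the replacement by hand with Matcher.appendReplacement and Matcher.appendTail, which is how the Matcher documentation describes the replaceAll loop. The class name ReplaceAllSketch and the method replaceAllByHand are hypothetical; this is an illustration, not the JDK's actual source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of a replaceAll-style loop.
class ReplaceAllSketch {

    // Applies the replacement once per match found, then appends the unmatched tail.
    static String replaceAllByHand(String regex, String input, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // $0 in the replacement expands to the text of the current match.
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two matches ("ABC_012" and "") mean the suffix is appended twice.
        System.out.println(replaceAllByHand(".*", "ABC_012", "$0_suffix"));
        // prints ABC_012_suffix_suffix
    }
}
```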
If you just want to add "_suffix" to your input, why not simply concatenate?
String result = "ABC_012" + "_suffix";
Probably .* first gives you the full match and then, at the end of the input, matches once more against the empty string (zero characters is still a match for .*). Try (.+) or (^.*$) instead. Both work as expected.
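A quick sketch to check both suggestions against the original pattern. The class SuffixFixes and the method addSuffix are made-up names. The last line uses replaceFirst, which is not mentioned above; it is included only as one more option that stops after the first match.

```java
// Hypothetical demo class comparing the patterns discussed above.
class SuffixFixes {

    // Appends "_suffix" to every match of the given regex in the input.
    static String addSuffix(String regex, String input) {
        return input.replaceAll(regex, "$0_suffix");
    }

    public static void main(String[] args) {
        String input = "ABC_012";
        System.out.println(addSuffix(".*", input));     // prints ABC_012_suffix_suffix
        System.out.println(addSuffix("(.+)", input));   // prints ABC_012_suffix
        System.out.println(addSuffix("(^.*$)", input)); // prints ABC_012_suffix
        // Extra option, not from the answer above: replace only the first match.
        System.out.println(input.replaceFirst(".*", "$0_suffix")); // prints ABC_012_suffix
    }
}
```

(.+) needs at least one character, so it cannot match the empty string at the end, and (^.*$) cannot start a second match there because ^ only matches at the beginning of the input.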
At regexinfo, the star is defined as follows:
*(star) - Repeats the previous item zero or more times. Greedy, so as many items as possible will be matched before trying permutations with less matches of the preceding item, up to the point where the preceding item is not matched at all.