Regexp grouping and replaceAll with .* in Java duplicates the replacement

故里飘歌 2021-01-18 17:28

I have a problem using a regexp in Java. The example code prints ABC_012_suffix_suffix, but I expected it to print ABC_012_suffix.



        
3 Answers
  • 2021-01-18 17:36
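    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    Pattern regexp  = Pattern.compile(".*");
    Matcher matcher = regexp.matcher("ABC_012");
    matcher.matches();                                    // true: .* matches the whole input
    System.out.println(matcher.group(0));                 // the full match
    System.out.println(matcher.replaceAll("$0_suffix"));  // resets the matcher, then replaces every match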
    

    The same happens here. The output is:

    ABC_012
    ABC_012_suffix_suffix
    

    The reason is hidden in the replaceAll method: it tries to find all subsequences that match the pattern:

    matcher.reset();  // replaceAll has already run the matcher to the end, so start over
    while (matcher.find()) {
      System.out.printf("Start: %s, End: %s%n", matcher.start(), matcher.end());
    }
    

    This will result in:

    Start: 0, End: 7
    Start: 7, End: 7
    

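    So, perhaps surprisingly, the matcher finds two subsequences: "ABC_012" (index 0 to 7) and an empty string at the very end of the input (index 7 to 7), where .* still matches zero characters. replaceAll appends "_suffix" to both of them: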

    "ABC_012" + "_suffix" + "" + "_suffix"
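
    To make the two matches visible in the result, here is a minimal sketch of what replaceAll does conceptually, built from the standard Matcher methods find, appendReplacement and appendTail. The names ReplaceAllTrace and replaceAllByHand are made up for this illustration, and this is not the JDK's actual source:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceAllTrace {
    // Hypothetical helper: a hand-rolled stand-in for matcher.replaceAll(replacement).
    static String replaceAllByHand(String regex, String input, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // Copies the unmatched text before this match, then the expanded replacement.
            // For ".*" on "ABC_012" this runs twice: once for "ABC_012", once for "".
            matcher.appendReplacement(out, replacement);
        }
        // Copies whatever input is left after the last match.
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAllByHand(".*", "ABC_012", "$0_suffix"));
    }
}
```

    Each pass through the loop contributes one "_suffix", so two matches give two suffixes.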
    
  • 2021-01-18 17:40

    If you just want to add "_suffix" to your input, why don't you simply concatenate?

    String result = "ABC_012" + "_suffix";

  • 2021-01-18 18:00

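    .* first gives you the "full match" and then, starting at the end of the input, an additional "empty match" (which is still a match). Try (.+) or (^.*$) instead: .+ needs at least one character, and ^ only matches at the start of the input, so neither can match a second time at the end. Both work as expected.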

    At regexinfo, the star is defined as follows:

    *(star) - Repeats the previous item zero or more times. Greedy, so as many items as possible will be matched before trying permutations with less matches of the preceding item, up to the point where the preceding item is not matched at all.
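
    As a quick check of the patterns suggested above, the following sketch runs each of them through replaceAll on the same input. The class and method names (SuffixFixes, withPattern, firstOnly) are hypothetical, and the replaceFirst variant is an extra option that was not mentioned in this thread:

```java
import java.util.regex.Pattern;

class SuffixFixes {
    // Hypothetical helper: applies replaceAll("$0_suffix") with the given pattern.
    static String withPattern(String regex, String input) {
        return Pattern.compile(regex).matcher(input).replaceAll("$0_suffix");
    }

    // Alternative (an assumption, not from the thread): replace only the first match of ".*".
    static String firstOnly(String input) {
        return input.replaceFirst(".*", "$0_suffix");
    }

    public static void main(String[] args) {
        String input = "ABC_012";
        System.out.println(withPattern(".*", input));     // ABC_012_suffix_suffix
        System.out.println(withPattern("(.+)", input));   // ABC_012_suffix
        System.out.println(withPattern("(^.*$)", input)); // ABC_012_suffix
        System.out.println(firstOnly(input));             // ABC_012_suffix
    }
}
```

    One difference to keep in mind: on an empty input, (.+) finds no match and leaves the string unchanged, while (^.*$) still matches once and yields "_suffix".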
