Why does (.*)* make two matches and select nothing in group $1?

不知归路 2021-02-04 01:31

This arose from a discussion on formalizing regular expression syntax. I've seen this behavior with several regular expression parsers, hence I tagged it language-agnostic.

1 answer
  •  不思量自难忘°
    2021-02-04 02:10

    Let's see what happens:

    1. (.*) matches "input".
    2. "input" is captured into group 1.
    3. The regex engine is now positioned at the end of the string. But since the group (.*) is itself repeated by the outer *, another match attempt is made:
    4. (.*) matches the empty string after "input".
    5. The empty string is captured into group 1, overwriting "input".
    6. $1 now contains the empty string.
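
The steps above can be reproduced with a short sketch. It assumes Python's built-in `re` module on CPython 3.7 or later; the question is language-agnostic, and other engines may handle an empty iteration of a repeated group differently, so treat the commented values as what this one engine is expected to produce rather than a universal rule.

```python
import re

# Assumption: CPython 3.7+ `re`. Other regex engines may differ.
pattern = re.compile(r"(.*)*")

# A single match attempt at the start of the string.
m = pattern.match("input")
whole = m.group(0)   # the overall match: "input"
last = m.group(1)    # the group's final iteration: ""
print(repr(whole), repr(last))

# Scanning the whole string finds two matches: "input", then the
# empty string at the end. Group 1 is empty in both.
spans = [(mm.span(), mm.group(1)) for mm in pattern.finditer("input")]
print(spans)  # [((0, 5), ''), ((5, 5), '')]

# Each match is replaced by "A" + group 1 + "B".
replaced = pattern.sub(r"A\1B", "input")
print(replaced)  # ABAB

# Without the outer repetition the group keeps its first capture.
single = re.match(r"(.*)", "input").group(1)
print(repr(single))  # 'input'
```

A capturing group inside a repetition only remembers its last iteration, which is why the outer `*` is what causes "input" to be overwritten.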

    A good question from the comments:

    Then why does replace("input", "(input)*", "A$1B") return "AinputBAB"?

    1. (input)* matches "input". It is replaced by "AinputB".
    2. (input)* matches the empty string. It is replaced by "AB" ($1 is empty because it didn't participate in the match).
    3. Result: "AinputBAB"
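
The same walkthrough for `(input)*`, again as a sketch that assumes Python's `re` on CPython 3.7 or later (where an empty match directly after a non-empty one is also replaced, and a group that did not participate expands to the empty string in the replacement). The `A\1B` template is Python's spelling of `A$1B`.

```python
import re

# Assumption: CPython 3.7+ `re`. Other regex engines may differ.
pattern = re.compile(r"(input)*")

# First match: "input", captured by group 1.
# Second match: the empty string at the end; group 1 did not
# participate, so Python reports it as None.
groups = [(m.span(), m.group(1)) for m in pattern.finditer("input")]
print(groups)  # [((0, 5), 'input'), ((5, 5), None)]

# In the replacement, the unmatched group expands to "".
result = pattern.sub(r"A\1B", "input")
print(result)  # AinputBAB
```

The difference from `(.*)*` is that `(input)` cannot match the empty string, so no empty iteration overwrites the capture within the first match.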
