This arose from a discussion on formalizing regular expression syntax. I've seen this behavior with several regular expression parsers, hence I tagged it language-agnostic.
Let's see what happens:

1. `(.*)` matches "input".
2. "input" is captured into group 1.
3. Because `(.*)` is repeated, another match attempt is made: `(.*)` matches the empty string after "input".
4. The empty string is captured into group 1, overwriting "input".
5. `$1` now contains the empty string.

A good question from the comments:
Then why does `replace("input", "(input)*", "A$1B")` return "AinputBAB"?

1. `(input)*` matches "input". It is replaced by "AinputB".
2. `(input)*` then matches the empty string after "input". It is replaced by "AB" (`$1` is empty because it didn't participate in the match).

The result is "AinputBAB".
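
Both walkthroughs above can be checked with a short script. This is only a sketch in Python's `re` module, not the generic `replace` function from the question. It assumes Python 3.7 or later, since older versions treat empty matches in `re.sub` differently. The pattern `(.*)*` is inferred from the statement that `(.*)` is repeated, and `\1` is Python's spelling of `$1`. Other engines may behave differently.

```python
import re

# Assumption: the pattern under discussion is (.*)*, i.e. the group (.*) repeated.
m = re.match(r"(.*)*", "input")
whole = m.group(0)   # the overall match: "input"
group1 = m.group(1)  # only the last iteration of the group survives: ""

# The follow-up case: list every match of (input)* and what group 1 holds.
# The second match is empty, and group 1 did not participate in it (None).
matches = [(mm.group(0), mm.group(1)) for mm in re.finditer(r"(input)*", "input")]

# In a replacement, Python substitutes an empty string for a group
# that did not participate, so the second match becomes "AB".
replaced = re.sub(r"(input)*", r"A\1B", "input")

print(repr(whole), repr(group1))  # 'input' ''
print(matches)                    # [('input', 'input'), ('', None)]
print(replaced)                   # AinputBAB
```

Note the difference between the two cases: in the first, group 1 took part in the final iteration and captured an empty string; in the second, group 1 never ran during the empty match, which Python reports as `None`.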