问题
Given is the following python script:
text = '<?xml version="1.24" encoding="utf-8">'
mu = (".??[?]?[?]", "....")
for item in mu:
print item,":",re.search(item, text).group()
Can someone please explain why the first hit with the regex .??[?]?[?]
returns <?
instead of just ?
.
My explaination:
.??
should match nothing as.?
can match or not any char and the second?
makes it not greedy.[?]?
can match?
or not, so nothing is good, too[?]
just matches?
That should result in ?
and not in <?
回答1:
For the same reason o*?bar
matches oobar
in foobar
. Even if the quantifier is non-greedy the regex will try to match from the first char in all possible ways, before moving on to the next.
First the .??
matches an empty string, but when the regex engine backtracks to it, it matches <
, thus making the rest of the regex match, without moving the start position of the match to the next character.
回答2:
Regex "greediness" only affects backtracking; it doesn't mean that the regex engine will skip earlier potential match points — a regex always takes the first possible match. In this case, that means <?
because it starts farther to the left than ?
.
来源:https://stackoverflow.com/questions/6589226/regular-expression-matches-more-than-expected