My understanding of the Python 3.4 regex library, based on the language reference, does not seem to match the results of my experiments with the module.
You need to understand that each time you write a pattern, it is first interpreted as a string literal before being read and interpreted a second time by the regex engine. Let's describe what happens:
>>> import re
>>> s = '\r'
s contains the character CR.
>>> re.match('\r', s)
<_sre.SRE_Match object; span=(0, 1), match='\r'>
Here the string '\r' is a string that contains CR, so a literal CR is given to the regex engine.
>>> re.match('\\r', s)
<_sre.SRE_Match object; span=(0, 1), match='\r'>
The string is now a literal backslash and a literal r. The regex engine receives these two characters, and since \r is a regex escape sequence that also means a CR character, you obtain a match too.
>>> re.match('\\\r', s)
<_sre.SRE_Match object; span=(0, 1), match='\r'>
The string contains a literal backslash and a literal CR. The regex engine receives \ and CR, but since \CR isn't a known regex escape sequence, the backslash is effectively ignored: the CR is matched literally and you obtain a match.
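If you want to check the first pass yourself, you can look at what each pattern string actually contains before the regex engine sees it. Here is a minimal sketch; the variable names (p1, p2, p3, lengths, matches) are only mine for illustration:

```python
import re

s = '\r'  # one character: CR

# The three pattern literals from above, after the string-literal pass.
p1 = '\r'     # 1 character: CR
p2 = '\\r'    # 2 characters: backslash, 'r'
p3 = '\\\r'   # 2 characters: backslash, CR

# Length of what the regex engine receives in each case.
lengths = [len(p) for p in (p1, p2, p3)]

# Whether each pattern matches the CR string.
matches = [re.match(p, s) is not None for p in (p1, p2, p3)]

print(lengths)
print(matches)
```

The three patterns are different strings, yet all of them end up matching the same single CR character.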
Note that for the regex engine, a literal backslash is the escape sequence \\ (so in a pattern string r'\\' or '\\\\').
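As a quick check of that last point, the sketch below compares the two spellings and searches for a backslash in a sample string. The sample text 'a\\b' and the variable names are assumptions of mine, not something from your question:

```python
import re

# r'\\' and '\\\\' are the same two-character string: backslash, backslash.
same_pattern = (r'\\' == '\\\\')
pattern_length = len(r'\\')

# The regex engine reads those two backslashes as one literal backslash.
text = 'a\\b'  # three characters: a, backslash, b
found = re.search(r'\\', text)

print(same_pattern, pattern_length, found.span())
```

Using a raw string skips the first interpretation step, so what you type is exactly what the regex engine receives, which is why raw strings are usually the less confusing choice for patterns.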