How Python and the regex module handle backslashes

予麋鹿 2020-12-04 01:02

My current understanding of the Python 3.4 regex library, based on the language reference, does not seem to match the results of my experiments with the module.



1 answer
  • 2020-12-04 02:03

    You need to understand that each time you write a pattern, it is first interpreted by Python as a string literal, and only then read and interpreted a second time by the regex engine. Let's walk through what happens:

    >>> import re
    >>> s = '\r'
    

    s contains a single character, a carriage return (CR).

    >>> re.match('\r', s)
    <_sre.SRE_Match object; span=(0, 1), match='\r'>
    

    Here the string literal '\r' produces a string that contains a single CR, so a literal CR is passed to the regex engine.

    >>> re.match('\\r', s)
    <_sre.SRE_Match object; span=(0, 1), match='\r'>
    

    The string now contains a literal backslash followed by a literal r. The regex engine receives these two characters, and since \r is a regex escape sequence that also means CR, you get a match here as well.

    >>> re.match('\\\r', s)
    <_sre.SRE_Match object; span=(0, 1), match='\r'>
    

    The string contains a literal backslash followed by a literal CR, so the regex engine receives \ and CR. Since \CR is not a known regex escape sequence and CR is not a letter or digit, the backslash simply escapes the CR, which is matched literally, and you get a match again.
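The three cases above can be checked side by side. This is a minimal sketch rather than part of the original answer; the names p1, p2, p3, lengths, same and results are illustrative only. It first shows which characters each string literal produces, then confirms that the regex engine matches CR for every one of them.

```python
import re

s = '\r'  # one character: carriage return (CR)

# Stage 1: Python turns each string literal into actual characters.
p1 = '\r'     # CR
p2 = '\\r'    # backslash, then the letter r
p3 = '\\\r'   # backslash, then CR

# Number of characters the regex engine receives for each pattern.
lengths = [len(p) for p in (p1, p2, p3)]
print(lengths)

# A raw string skips escape processing, so r'\r' is the same string as '\\r'.
same = (r'\r' == p2)
print(same)

# Stage 2: the regex engine interprets those characters.
results = [re.match(p, s) is not None for p in (p1, p2, p3)]
print(results)
```

Writing patterns as raw strings (the r'...' form) keeps the first stage out of the way, so what you type is what the regex engine sees.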

    Note that for the regex engine, a literal backslash is written as the escape sequence \\ (two backslashes), so in Python source the pattern string is r'\\' or '\\\\'.
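As a quick check of the note on literal backslashes, here is a small sketch (the variable names are illustrative assumptions, not from the original answer). It compares the two spellings of the pattern, matches a single backslash, and shows that a pattern consisting of one lone backslash is rejected by the regex engine.

```python
import re

backslash = '\\'         # a string holding exactly one backslash

raw_pattern = r'\\'      # raw string: two backslash characters
plain_pattern = '\\\\'   # normal string: also two backslashes after stage 1

# Both spellings hand the same two characters to the regex engine.
same_pattern = (raw_pattern == plain_pattern)

# The regex engine reads \\ as "one literal backslash".
matched = re.match(raw_pattern, backslash).group(0)

# re.escape builds the escaped form for you.
escaped = re.escape(backslash)

# A lone backslash is an incomplete escape, so compiling it fails.
try:
    re.compile('\\')
    lone_backslash_ok = True
except re.error:
    lone_backslash_ok = False

print(same_pattern, len(raw_pattern), matched == backslash, lone_backslash_ok)
```

When the text to match comes from data rather than from a hand-written pattern, re.escape is usually the safer way to get the backslashes right.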
