Can't get Python regex backreferences to work

て烟熏妆下的殇ゞ submitted on 2019-11-28 09:26:07

Question


I want to match the docstrings of a Python file. For example:

r""" Hello this is Foo
     """

Using only """ should be enough for the start.

>>> data = 'r""" Hello this is Foo\n     """'
>>> def display(m):
...     if not m:
...             return None
...     else:
...             return '<Match: %r, groups=%r>' % (m.group(), m.groups())
...
>>> import re
>>> print(display(re.match('r?"""(.*?)"""', data, re.S)))
<Match: 'r""" Hello this is Foo\n     """', groups=(' Hello this is Foo\n     ',)>
>>> print(display(re.match('r?(""")(.*?)\1', data, re.S)))
None

Can someone please explain to me why the first expression matches and the other does not?


Answer 1:


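In a normal (non-raw) string literal, Python interprets \1 as an octal escape sequence and turns it into the single character '\x01' before the regex engine ever sees the pattern. The pattern therefore contains that control character instead of the backreference \1.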

You can fix this by escaping the backslash before the 1:

print(display(re.match('r?(""")(.*?)\\1', data, re.S)))

You can also fix it by using a raw string for your regex, in which backslashes are not processed as escape sequences:

print(display(re.match(r'r?(""")(.*?)\1', data, re.S)))
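
To see the difference concretely, here is a minimal sketch that compares the two string literals and the patterns they produce. It is written for Python 3, unlike the Python 2 session in the question, and the variable names plain and raw are only illustrative.

```python
import re

data = 'r""" Hello this is Foo\n     """'

# Normal string literal: Python turns \1 into the control character chr(1).
plain = 'r?(""")(.*?)\1'
# Raw string literal: the backslash and the 1 are kept as two characters.
raw = r'r?(""")(.*?)\1'

print(repr(plain[-1]))  # last character of the normal literal
print(repr(raw[-2:]))   # last two characters of the raw literal

# The normal literal asks the regex engine for a literal chr(1), which is
# not in the data; the raw literal asks for a repeat of group 1.
no_match = re.match(plain, data, re.S)
match = re.match(raw, data, re.S)

print(no_match)
print(match.groups())
```

Printing the repr of the pattern string before passing it to re is a quick way to check what the regex engine will actually receive.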



Answer 2:


I think you might be missing the re.DOTALL or re.MULTILINE flag. In this case, re.DOTALL should allow the .*? in your regex to match newlines as well.
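
As a small sketch of what re.DOTALL changes, the example below matches across a newline with and without the flag. Note that re.S, which the question's code already passes, is the short alias for re.DOTALL, so that flag is not what is missing there. The sample text and variable names are made up for illustration.

```python
import re

text = 'first line\nsecond line'

# re.S and re.DOTALL refer to the same flag.
same_flag = (re.S == re.DOTALL)

# Without the flag, "." stops at the newline, so the pattern cannot
# reach "second".
without_flag = re.match(r'first(.*)second', text)

# With the flag, "." also matches the newline.
with_flag = re.match(r'first(.*)second', text, re.DOTALL)

print(same_flag)
print(without_flag)
print(repr(with_flag.group(1)))
```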



Source: https://stackoverflow.com/questions/23071305/cant-get-python-regex-backreferences-to-work
