I want to match the docstrings in a Python file, e.g.:
r""" Hello this is Foo
"""
Matching only """ should be enough for a start.
>>> data = 'r""" Hello this is Foo\n """'
>>> def display(m):
...     if not m:
...         return None
...     else:
...         return '<Match: %r, groups=%r>' % (m.group(), m.groups())
...
>>> import re
>>> print display(re.match('r?"""(.*?)"""', data, re.S))
<Match: 'r""" Hello this is Foo\n """', groups=(' Hello this is Foo\n ',)>
>>> print display(re.match('r?(""")(.*?)\1', data, re.S))
None
Can someone please explain to me why the first expression matches and the other does not?
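In a regular (non-raw) string literal, Python reads \1 as an escape sequence for the character with octal code 1, so the regex engine never receives the backreference \1.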
You can fix this by escaping the \ before the 1:
print display(re.match('r?(""")(.*?)\\1', data, re.S))
You can also fix it by using a raw string for your regex, with no escape sequences.
print display(re.match(r'r?(""")(.*?)\1', data, re.S))
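To make the difference visible, here is a minimal sketch. It is written with print() calls so that it runs under Python 3 (the session in the question is Python 2), and the variable names plain, escaped, raw, broken and fixed are only illustrative.

```python
import re

data = 'r""" Hello this is Foo\n """'

plain = '\1'     # octal escape: one character, chr(1)
escaped = '\\1'  # two characters: a backslash and '1'
raw = r'\1'      # raw string: the same two characters

print(len(plain), len(escaped), len(raw))
print(plain == chr(1), escaped == raw)

# Without the raw prefix the pattern ends in chr(1), which never occurs in data.
broken = re.match('r?(""")(.*?)\1', data, re.S)
# With the raw prefix the regex engine sees a real backreference to group 1.
fixed = re.match(r'r?(""")(.*?)\1', data, re.S)

print(broken)
print(fixed.groups())
```

The first pattern looks for a literal control character after the docstring body, which is why it returns None; the second one reuses whatever group 1 captured as the closing delimiter.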
I think you might be missing the re.DOTALL or re.MULTILINE flags. In this case, re.DOTALL should allow the .*? in your regex to match newlines as well.
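As a rough illustration of what re.DOTALL changes, the sketch below matches a made-up string with and without the flag (text, without_flag and with_flag are hypothetical names, and the code is written for Python 3). It also checks that re.S, which the question already passes, is the short alias for re.DOTALL.

```python
import re

text = 'a\nb'

# By default '.' does not match a newline, so this match fails.
without_flag = re.match('a(.*?)b', text)
# With re.DOTALL, '.' matches any character, including the newline.
with_flag = re.match('a(.*?)b', text, re.DOTALL)

print(without_flag)
print(repr(with_flag.group(1)))
print(re.S == re.DOTALL)
```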
Source: https://stackoverflow.com/questions/23071305/cant-get-python-regex-backreferences-to-work