Question
I am trying to use some simple regex functions in Python. I am using regexes to catch patterns written in the Arabic alphabet, but matching fails even in the simplest cases when a few letters are added before the pattern, regardless of whether there is a ligature or not:
>>> import re
>>> p = re.compile(r'ترينهايمان')
>>> p.match('به ترينهايمان')
>>>
>>> p = re.compile(r'ترینهایمان')
>>> p.match('بهترینهایمان')
>>>
The longer string is simply the pattern itself with two letters added at the beginning.
As far as I know, match should have returned a match object, but it returns nothing.
This is curious, because when a letter is added to the end of the pattern, it does match:
>>> p = re.compile(r'ترينهايمان')
>>> p.match('ترينهايماني')
<_sre.SRE_Match object at 0x02C52FA8>
>>> p.match('بهترينهايمان')
>>>
Answer 1:
re.match will only match patterns that start at the beginning of the string:
re.match(pattern, string, flags=0)
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.
Since you're trying to match a string with extra characters at the beginning, match won't recognize the string as a match. You need to use re.search instead.
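The difference can be sketched with the strings from the question. This is a minimal illustration under Python 3, not a definitive recipe; the variable names (word, prefix, text, anchored, found, offset_match) are made up for the example, and the text is built by concatenation so that the prefix length is explicit.

```python
import re

# Illustrative names only; the word and prefix are taken from the question.
word = 'ترينهايمان'
prefix = 'به'
text = prefix + word

pattern = re.compile(word)

# match() only tries to match at the start of the string,
# so the extra leading letters make it return None.
anchored = pattern.match(text)
print(anchored)

# search() scans the whole string and returns the first match it finds.
found = pattern.search(text)
print(found.start() == len(prefix))
print(found.group() == word)

# A compiled pattern's match() also accepts a start position,
# which helps when the offset is already known.
offset_match = pattern.match(text, len(prefix))
print(offset_match is not None)
```

If the pattern may occur anywhere in the input, search is usually the simpler choice; passing a position to match only makes sense when the offset is known in advance.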
Source: https://stackoverflow.com/questions/24771389/regex-match-fails-to-catch-a-simple-pattern-in-python