Question
I am using a dictionary whose keys are regular expressions to substitute portions of different strings, as elegantly described in a previous SO question by @roippi. The first 're.sub' call works perfectly. However, whenever the pattern actually uses regex syntax (the second 're.sub' call), the substitution doesn't work.
I am very confused as to why this is the case. I have tried both with and without the 'r' prefix, as well as adding lookahead/lookbehind expressions, but nothing seems to work. Any help would be greatly appreciated!
import re
test_dict = {r'(\d+)': 'THIS IS A NUMBER', 'john_doe': 'THIS IS A NAME'}
re.sub('(john_doe)', lambda x: test_dict.get(x.group(1), x.group(1)), 'john_doe_jr')  # 'THIS IS A NAME_jr'
re.sub(r'(\d+)', lambda x: test_dict.get(x.group(1), x.group(1)), '999la')  # '999la' (unchanged)
Answer 1:
match.group(n) does not return the regular expression that was used to match the nth group, but the text that the nth group captured. The lambda therefore evaluates test_dict.get('999', '999'), which returns '999', because '999' is not a key in your dictionary.
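To make the difference concrete, here is a small sketch that inspects a match object directly. The variable names (matched_text, pattern_text, and so on) are only illustrative and do not come from the question.

```python
import re

test_dict = {r'(\d+)': 'THIS IS A NUMBER', 'john_doe': 'THIS IS A NAME'}

m = re.search(r'(\d+)', '999la')
matched_text = m.group(1)    # the text captured by group 1
pattern_text = m.re.pattern  # the pattern string the match was made with

# The dictionary is keyed by pattern strings, so looking up the captured
# text misses and falls back to the default.
lookup_by_text = test_dict.get(matched_text, matched_text)

# Looking up the pattern string itself hits the key.
lookup_by_pattern = test_dict.get(pattern_text, matched_text)
```

If the pattern you pass to re.sub is always exactly one of the dictionary keys (an assumption that holds for the two calls in the question), then keying on match.re.pattern instead of match.group(1) would keep the lookup a plain dictionary access.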
You could iterate over the keys of the dictionary and check if any key matches your capture group, and then replace it, but that has O(n) time complexity (in the size of the dictionary).
def replacement(match, d, group=1):
    for key in d:
        if re.match(key, match.group(group)):
            return d[key]
    return match.group(group)
re.sub(r'(\d+)', lambda x: replacement(x, test_dict), '999la')
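One thing to be aware of is that re.match only anchors at the start of the string, so a short key can match a longer capture by prefix. A possible variation, sketched below, compiles the keys once up front and uses fullmatch so the whole captured text has to match a key. The names compiled and replacement_fullmatch are made up for this example.

```python
import re

test_dict = {r'(\d+)': 'THIS IS A NUMBER', 'john_doe': 'THIS IS A NAME'}

# Compile each key once up front; pair it with its replacement text.
compiled = [(re.compile(key), value) for key, value in test_dict.items()]

def replacement_fullmatch(match, rules, group=1):
    text = match.group(group)
    for pattern, value in rules:
        # fullmatch requires the entire captured text to match the key,
        # not just a prefix of it.
        if pattern.fullmatch(text):
            return value
    # No key matched: leave the captured text as it was.
    return text

result_number = re.sub(r'(\d+)', lambda m: replacement_fullmatch(m, compiled), '999la')
result_name = re.sub(r'(john_doe)', lambda m: replacement_fullmatch(m, compiled), 'john_doe_jr')
```

This is still a linear scan over the dictionary for every match, so it has the same O(n) cost as the version above.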
Source: https://stackoverflow.com/questions/48085187/python-regex-sub-using-dictionary-with-regex-expressions