Python regex substitution: separate backreference from digit

Submitted by 浪子不回头ぞ on 2019-12-09 02:20:05

Question


In a regex replacement pattern, a backreference looks like \1. If you want to include a digit after that backreference, this will fail because the digit is considered to be part of the backreference number:

import re
# replace each pair of digits with zeroes, but retain the whitespace in between
re.sub(r"\d(\s*)\d", r"0\10", "0 1")
# raises sre_constants.error (shown as re.error in newer Python versions): invalid group reference

A replacement pattern such as r"0\1 0" works fine, but in the failing example \10 is parsed as a reference to group 10 rather than as group 1 followed by a literal 0.

How can the digit '0' be separated from the back-reference \1 that precedes it?


Answer 1:


You can use \g<1>, as mentioned in the docs for re.sub. The replacement \g<1>0 is read as group 1 followed by a literal 0.
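As a minimal sketch of this answer applied to the example from the question (the ten-group pattern and the variable names are made up purely for illustration), the following shows \g<1> keeping the trailing digit separate, and confirms that \10 really does refer to group 10 when such a group exists:

```python
import re

# \g<1> names group 1 explicitly, so the "0" after it stays a literal digit.
result = re.sub(r"\d(\s*)\d", r"0\g<1>0", "0 1")
print(result)  # 0 0

# Hypothetical pattern with ten groups, used only to show how \10 is parsed.
ten_groups = r"(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)"

# \10 is read as group 10, which matched "j".
as_group_ten = re.sub(ten_groups, r"\10", "abcdefghij")
print(as_group_ten)  # j

# \g<1>0 is read as group 1 ("a") followed by a literal "0".
as_group_one = re.sub(ten_groups, r"\g<1>0", "abcdefghij")
print(as_group_one)  # a0
```

The \g<number> form is only needed in the replacement string; the search pattern itself is unchanged.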




Answer 2:


Instead of using a backreference with a sequence number (\1), you can use named groups and the problem is solved:

import re
# replace each pair of digits with zeroes, but retain the whitespace in between
re.sub(r"\d(?P<whitespace>\s*)\d", r"0\g<whitespace>0", "0 1")
# returns '0 0'

It turns out this trick is in fact described in the documentation of re.sub.



Source: https://stackoverflow.com/questions/16810523/python-regex-subsitution-separate-backreference-from-digit
