Find all matches of a pattern and replace them in a text

耗尽温柔 submitted on 2020-11-30 00:12:22

Question


I have a pattern as below:

measurement = re.compile(r"(\d+(?:\.\d*)?)\s*x\s*(\d+(?:\.\d*)?)\s*(cm|mm|millimeter|centimeter|millimeters|centimeters)")

It can appear several times in a sentence and in a document. I want to find all matches and replace each one with "MEASUREMENT", and I also want to collect the matched values in a list.

**Input_Text**: measuring 9 x 5 mm and previously measuring 8 x 6 mm

**Output**: measuring MEASUREMENT and previously measuring MEASUREMENT

**List**: 9 x 5 mm, 8 x 6 mm

So far my code is below, but it only finds the first match:

result = re.search(measurement, Input_Text)
if result:
    Input_Text = Input_Text.replace(result, "MEASUREMENT")

Answer 1:


You can use re.sub() for the replacement, and re.findall() to get all matched strings.

import re

measurement = re.compile(r"(\d+(?:\.\d*)?)\s*x\s*(\d+(?:\.\d*)?)\s*(cm|mm|millimeter|centimeter|millimeters|centimeters)")

text = "measuring 9 x 5 mm and previously measuring 8 x 6 mm"

values = re.findall(pattern=measurement, string=text)

sub_text = re.sub(pattern=measurement, string=text, repl='MEASUREMENT')

>>> sub_text
'measuring MEASUREMENT and previously measuring MEASUREMENT'

>>> values
[('9', '5', 'mm'), ('8', '6', 'mm')]
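Because the pattern contains capturing groups, re.findall() returns a tuple of the groups for each match rather than the full matched text. If the whole strings are wanted, as in the question's expected list, one possible sketch uses re.finditer() with group(0). The variable name `full_matches` is only illustrative and does not come from the original answer.

```python
import re

measurement = re.compile(
    r"(\d+(?:\.\d*)?)\s*x\s*(\d+(?:\.\d*)?)\s*"
    r"(cm|mm|millimeter|centimeter|millimeters|centimeters)"
)

text = "measuring 9 x 5 mm and previously measuring 8 x 6 mm"

# group(0) is the entire match, regardless of capturing groups
full_matches = [m.group(0) for m in measurement.finditer(text)]

print(full_matches)  # ['9 x 5 mm', '8 x 6 mm']
```

An alternative would be to make every group non-capturing with (?:...), in which case re.findall() itself returns the full strings.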



Answer 2:


If you don't want to scan your string twice, you can call re.sub with a function as the replacement argument. The function is called once per match, so it can append each matched string to a list before returning the replacement text.

import re

pat = re.compile(r'\d+(?:\.\d*)?\s*x\s*\d+(?:\.\d*)?\s*(?:cm|mm|millimeters?|centimeters?)')

s = r'measuring 9 x 5 mm and previously measuring 8 x 6 mm'

l = []

def repl(m):
    l.append(m.group(0))
    return 'MEASUREMENT'

s = pat.sub(repl, s)
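The same idea can be wrapped in a helper so the list is not a module-level global. This is only a sketch: the names `replace_measurements`, `placeholder`, and `found` are hypothetical and not part of the original answer.

```python
import re

PAT = re.compile(
    r"\d+(?:\.\d*)?\s*x\s*\d+(?:\.\d*)?\s*(?:cm|mm|millimeters?|centimeters?)"
)

def replace_measurements(text, placeholder="MEASUREMENT"):
    """Return (text with matches replaced, list of the matched strings)."""
    found = []

    def repl(m):
        # Called once per match during the single pass made by sub()
        found.append(m.group(0))
        return placeholder

    return PAT.sub(repl, text), found

new_text, found = replace_measurements(
    "measuring 9 x 5 mm and previously measuring 8 x 6 mm"
)
print(new_text)  # measuring MEASUREMENT and previously measuring MEASUREMENT
print(found)     # ['9 x 5 mm', '8 x 6 mm']
```

Keeping the list local to each call means repeated calls on different documents do not mix their results.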


Source: https://stackoverflow.com/questions/46023527/find-all-matches-of-a-pattern-and-replace-them-in-a-text
