Remove periods at the end of sentences in Python

浪子不回头ぞ submitted on 2019-12-13 16:31:46

Question


I have sentences like this: "this is a test. 4.55 and 5,000." I want to remove the periods at the end of sentences, but not the ones between digits. The output should be "this is a test 4.55 and 5,000". I tried the options below, but none of them gives the required output:

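import re

wordList = "this is a test. 4.55 and 5,000."
# Replaces every run of non-word characters with a space, so the "." in 4.55 and the "," in 5,000 are lost too
pattern3 = re.compile(r"[^\w\d]+")
wordList = pattern3.sub(' ', wordList)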

I also tried these two patterns:

pattern3 = re.compile(r"[^\w]|^[0-9]\.[0-9]")
pattern3 = re.compile(r"[^\w]|^([0-9]/.[0-9]+)")

I don't know where I am going wrong. Can someone give me some pointers? I searched the earlier posts and tried them, but they are not working for my situation.


Answer 1:


Try a negative lookahead:

\.(?!\d)

What this matches is any period that's not followed by a digit.
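As a minimal sketch of how that lookahead could be applied to the string from the question (the function name strip_sentence_periods is made up for illustration):

```python
import re

# A period that is NOT immediately followed by a digit.
# The lookahead (?!\d) checks the next character without consuming it.
NOT_BEFORE_DIGIT = re.compile(r"\.(?!\d)")


def strip_sentence_periods(text):
    """Remove every period unless the next character is a digit."""
    return NOT_BEFORE_DIGIT.sub("", text)


print(strip_sentence_periods("this is a test. 4.55 and 5,000."))
# this is a test 4.55 and 5,000
```

Keep in mind that this also strips periods from abbreviations such as "e.g.", and it keeps a sentence-ending period if the next sentence starts with a digit and there is no space in between.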




Answer 2:


In regex, the $ special character "[matches] the end of the string or just before the newline at the end of the string"

In that case, assuming only one sentence per line, I would suggest the following:

\.$

This matches only a period at the end of the string (or, with the re.MULTILINE flag, at the end of each line). Of course, if you cannot guarantee one sentence per line, then this isn't the solution for you.
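A short sketch of both cases, assuming one sentence per line; the two function names are hypothetical. Note that "end of each line" only applies when re.MULTILINE is passed:

```python
import re


def strip_trailing_period(text):
    """Remove a single period at the very end of the string."""
    return re.sub(r"\.$", "", text)


def strip_line_end_periods(text):
    """Remove a period at the end of every line (needs re.MULTILINE)."""
    return re.sub(r"\.$", "", text, flags=re.MULTILINE)


# Only the final period goes; the one after "test" stays.
print(strip_trailing_period("this is a test. 4.55 and 5,000."))
# this is a test. 4.55 and 5,000

# With one sentence per line, every line-final period is removed.
print(strip_line_end_periods("this is a test.\n4.55 and 5,000."))
```

On the single-line string from the question this leaves the period after "test" in place, which is why the one-sentence-per-line assumption matters.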




Answer 3:


How about

pattern = re.compile(r'\.(\s)')
wordList = pattern.sub(r'\1', wordList)

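This replaces a period followed by a whitespace character with just that whitespace character. Note that a period at the very end of the string, with nothing after it, is left in place.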
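Because that pattern requires whitespace after the period, the final period in the question's example would survive. One possible variant, sketched here as an assumption rather than as part of the original answer, uses a lookahead that accepts either whitespace or the end of the string (the function name is hypothetical):

```python
import re

# A period followed by whitespace or by the end of the string.
# The lookahead leaves the whitespace itself untouched.
SENTENCE_END = re.compile(r"\.(?=\s|$)")


def strip_periods_before_space_or_end(text):
    """Remove periods that are followed by whitespace or end the string."""
    return SENTENCE_END.sub("", text)


print(strip_periods_before_space_or_end("this is a test. 4.55 and 5,000."))
# this is a test 4.55 and 5,000
```

Unlike the "not followed by a digit" approach, this keeps periods inside tokens such as "a.b", since no whitespace follows them.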



Source: https://stackoverflow.com/questions/12448401/remove-periods-at-the-end-of-sentences-in-python
