Best way of removing single newlines but keeping multiple newlines

Posted by 我们两清 on 2021-01-04 10:46:12

Question


What would be the most pythonic way of removing single newlines but keeping multiple newlines from a string?

As in

"foo\n\nbar\none\n\rtwo\rthree\n\n\nhello"

turning into

"foo\n\nbar one two three\n\n\nhello"

I was thinking about using splitlines(), then replacing empty lines by "\n" and then concatenating everything back again, but I suspect there is a better/simpler way. Maybe using regexes?


Answer 1:


>>> import re
>>> s = "foo\n\nbar\none\n\rtwo\rthree\n\n\nhello"
>>> re.sub('(?<![\r\n])(\r?\n|\n?\r)(?![\r\n])', ' ', s)
'foo\n\nbar one two three\n\n\nhello'

This looks for \r?\n or \n?\r and uses lookbehind and lookahead assertions to make sure there is no newline character on either side of the match.
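As a minimal sketch, the same pattern can be written with re.VERBOSE so that each assertion carries its own comment. The names SINGLE_NEWLINE and join_single_newlines are hypothetical, chosen only for this illustration, and the group is made non-capturing, which does not change what re.sub replaces.

```python
import re

# Same pattern as in the answer, laid out with re.VERBOSE.
# SINGLE_NEWLINE and join_single_newlines are illustrative names only.
SINGLE_NEWLINE = re.compile(
    r"""
    (?<![\r\n])          # not preceded by a CR or LF
    (?:\r?\n | \n?\r)    # one line ending: LF, CRLF, CR or LF+CR
    (?![\r\n])           # not followed by a CR or LF
    """,
    re.VERBOSE,
)


def join_single_newlines(text, replacement=" "):
    """Replace isolated line endings, leaving runs of two or more alone."""
    return SINGLE_NEWLINE.sub(replacement, text)


s = "foo\n\nbar\none\n\rtwo\rthree\n\n\nhello"
print(repr(join_single_newlines(s)))
```

Compiling the pattern once is just a convenience if the function is called on many strings; a plain re.sub call behaves the same way.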

For what it's worth, there are three types of line endings found in the wild:

  1. \n on Linux, Mac OS X, and other Unices
  2. \r\n on Windows, and in the HTTP protocol
  3. \r on Mac OS 9 and earlier

The first two are by far the most common. If you want to limit the possibilities to just those three, you could do:

>>> re.sub('(?<![\r\n])(\r?\n|\r)(?![\r\n])', ' ', s)
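'foo\n\nbar one\n\rtwo three\n\n\nhello'

Note that with this narrower pattern the \n\r pair in the sample string is left alone: it is no longer one of the recognized line endings, so it is read as two adjacent line breaks.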

And of course, get rid of the |\r if you don't care about Mac line endings, which are rare.
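For illustration, here is a small sketch of that last variant, with |\r dropped so that only \n and \r\n count as line endings. The function name join_single_unix_or_windows_newlines is hypothetical and the test strings are made up for this example.

```python
import re

# Variant without the lone-CR alternative: only LF and CRLF are line endings.
# The function name is an illustrative assumption, not from the answer.
LF_OR_CRLF = re.compile(r"(?<![\r\n])\r?\n(?![\r\n])")


def join_single_unix_or_windows_newlines(text, replacement=" "):
    """Replace isolated LF or CRLF line endings with the replacement."""
    return LF_OR_CRLF.sub(replacement, text)


print(repr(join_single_unix_or_windows_newlines("a\nb\n\nc\r\nd")))
print(repr(join_single_unix_or_windows_newlines("x\ry")))
```

A lone \r is not touched by this version, so text with classic Mac OS line endings would pass through unchanged.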



Source: https://stackoverflow.com/questions/22649851/best-way-of-removing-single-newlines-but-keeping-multiple-newlines
