Question
What would be the most pythonic way of removing single newlines but keeping multiple newlines from a string?
As in
"foo\n\nbar\none\n\rtwo\rthree\n\n\nhello"
turning into
"foo\n\nbar one two three\n\n\nhello"
I was thinking about using splitlines(), then replacing empty lines by "\n" and then concatenating everything back again, but I suspect there is a better/simpler way. Maybe using regexes?
Answer 1:
>>> import re
>>> s = "foo\n\nbar\none\n\rtwo\rthree\n\n\nhello"
>>> re.sub('(?<![\r\n])(\r?\n|\n?\r)(?![\r\n])', ' ', s)
'foo\n\nbar one two three\n\n\nhello'
This looks for \r?\n or \n?\r and uses negative lookbehind and lookahead assertions to make sure there is no other newline character on either side of the match.
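To make the pieces of the pattern easier to see, here is a minimal sketch of the same regex written in verbose mode. The names SINGLE_NEWLINE and join_single_newlines are made up for illustration; they are not from the question or from any library.

```python
import re

# Same pattern as in the answer, compiled with re.VERBOSE so that each
# part can carry a comment. The constant and function names are hypothetical.
SINGLE_NEWLINE = re.compile(
    r"""
    (?<![\r\n])      # negative lookbehind: no CR or LF just before
    (\r?\n | \n?\r)  # one line ending: LF, CR LF, CR, or LF CR
    (?![\r\n])       # negative lookahead: no CR or LF just after
    """,
    re.VERBOSE,
)


def join_single_newlines(text, replacement=" "):
    """Replace isolated line endings with `replacement`; leave runs alone."""
    return SINGLE_NEWLINE.sub(replacement, text)


s = "foo\n\nbar\none\n\rtwo\rthree\n\n\nhello"
print(repr(join_single_newlines(s)))
# 'foo\n\nbar one two three\n\n\nhello'
```

Compiling the pattern once is just a convenience if the function is called many times; a plain re.sub call behaves the same.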
For what it's worth, there are three types of line endings found in the wild:
\n on Linux, Mac OS X, and other Unices
\r\n on Windows, and in the HTTP protocol
\r on Mac OS 9 and earlier
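As a quick check, the first pattern can be run against each of those three conventions. This is only a sketch; the dictionary keys and the sample text are invented for the demonstration.

```python
import re

# The first pattern from the answer, written as a raw string.
PATTERN = re.compile(r"(?<![\r\n])(\r?\n|\n?\r)(?![\r\n])")

# Illustrative labels for the three line-ending conventions listed above.
LINE_ENDINGS = {"unix": "\n", "windows": "\r\n", "classic_mac": "\r"}

results = {}
for name, eol in LINE_ENDINGS.items():
    # One single line break followed by one doubled line break.
    text = f"one{eol}two{eol}{eol}three"
    results[name] = PATTERN.sub(" ", text)
    print(name, repr(results[name]))
```

In each case the single break becomes a space and the doubled break is left untouched.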
The first two are by far the most common. If you want to limit the possibilities to just those three, you could do:
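>>> re.sub('(?<![\r\n])(\r?\n|\r)(?![\r\n])', ' ', s)
'foo\n\nbar one\n\rtwo three\n\n\nhello'
Note that the unusual \n\r pair in the sample string is not one of those three endings, so this pattern reads it as two consecutive line endings and leaves it in place.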
And of course, get rid of the |\r if you don't care about Mac line endings, which are rare.
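With the |\r alternative dropped, the pattern would look something like the sketch below. It assumes the input only ever uses \n or \r\n; the names UNIX_OR_WINDOWS and join_lines are hypothetical.

```python
import re

# Only LF and CR LF count as line endings here; a bare CR is not matched.
UNIX_OR_WINDOWS = re.compile(r"(?<![\r\n])\r?\n(?![\r\n])")


def join_lines(text):
    """Replace isolated LF / CR LF line endings with a single space."""
    return UNIX_OR_WINDOWS.sub(" ", text)


sample = "foo\n\nbar\none\r\ntwo\n\n\nhello"
print(repr(join_lines(sample)))
# 'foo\n\nbar one two\n\n\nhello'
```

A lone \r is simply passed through unchanged by this version.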
Source: https://stackoverflow.com/questions/22649851/best-way-of-removing-single-newlines-but-keeping-multiple-newlines