What is the Python equivalent of Perl's chomp function, which removes the last character of a string if it is a newline?
First split the string into lines, then join them with any separator you like:
x = ' '.join(x.splitlines())
should work like a charm.
You can use strip. Note that it removes all leading and trailing whitespace, not only a trailing newline:
line = line.strip()
demo:
>>> "\n\n hello world \n\n".strip()
'hello world'
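To make the difference concrete, here is a small sketch comparing strip() with rstrip("\n") on the same input. The variable names are only illustrative.

```python
# Illustrative input: whitespace and newlines on both sides of the text.
raw = "\n\n hello world \n\n"

# strip() with no argument removes all leading and trailing whitespace.
stripped = raw.strip()

# rstrip("\n") removes only trailing newline characters; everything else stays.
right_stripped = raw.rstrip("\n")

print(repr(stripped))        # 'hello world'
print(repr(right_stripped))  # '\n\n hello world '
```

If leading whitespace or trailing spaces are significant, rstrip with an explicit argument is probably the safer choice.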
It looks like there is no perfect analog for Perl's chomp. In particular, rstrip cannot treat a multi-character newline delimiter like \r\n as a single unit: it strips any run of the given characters. However, splitlines does handle it, as pointed out here.
Following my answer on a different question, you can combine join and splitlines to remove or replace all newlines in a string s:
''.join(s.splitlines())
The following removes exactly one trailing newline (as chomp would, I believe). Passing True as the keepends argument to splitlines retains the delimiters. Then splitlines is called again to remove the delimiter from just the last "line":
def chomp(s):
    if len(s):
        lines = s.splitlines(True)
        last = lines.pop()
        return ''.join(lines + last.splitlines())
    else:
        return ''
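As a point of comparison, the same "remove exactly one trailing line ending" behavior can be sketched with explicit suffix checks. The name chomp_once is hypothetical, and this version only recognizes \r\n, \n, and \r, whereas splitlines also treats several other characters (such as \v, \f, and \u2028) as line boundaries.

```python
def chomp_once(s):
    """Remove at most one trailing line ending (CRLF, LF, or CR) from s."""
    # Check the two-character ending first so that CRLF is removed as a unit.
    for ending in ("\r\n", "\n", "\r"):
        if s.endswith(ending):
            return s[:-len(ending)]
    # No recognized line ending: return the string unchanged.
    return s


print(repr(chomp_once("Hello\n\n\n")))      # 'Hello\n\n'
print(repr(chomp_once("Windows EOL\r\n")))  # 'Windows EOL'
print(repr(chomp_once("no newline")))       # 'no newline'
```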
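import re
# Matches every newline, tab, and carriage return anywhere in the text.
r_unwanted = re.compile("[\n\t\r]")
# sub returns a new string, so assign the result back.
your_text = r_unwanted.sub("", your_text)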
The canonical way to strip end-of-line (EOL) characters is to use the string rstrip() method removing any trailing \r or \n. Here are examples for Mac, Windows, and Unix EOL characters.
>>> 'Mac EOL\r'.rstrip('\r\n')
'Mac EOL'
>>> 'Windows EOL\r\n'.rstrip('\r\n')
'Windows EOL'
>>> 'Unix EOL\n'.rstrip('\r\n')
'Unix EOL'
Using '\r\n' as the parameter to rstrip means that it will strip any trailing run of '\r' and '\n' characters, in any order and of any length. That's why it works in all three cases above.
This nuance matters in rare cases. For example, I once had to process a text file which contained an HL7 message. The HL7 standard requires a trailing '\r' as its EOL character. The Windows machine on which I was using this message had appended its own '\r\n' EOL character. Therefore, the end of each line looked like '\r\r\n'. Using rstrip('\r\n') would have taken off the entire '\r\r\n' which is not what I wanted. In that case, I simply sliced off the last two characters instead.
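The situation described above can be reproduced with a short sketch. The sample message text is made up for illustration, and the endswith check is an added safeguard rather than part of the original approach.

```python
# Hypothetical line: HL7's own '\r' terminator followed by a Windows '\r\n'.
line = "MSH|example segment\r\r\n"

# rstrip removes every trailing '\r' and '\n', including the HL7 terminator.
over_stripped = line.rstrip("\r\n")

# Slicing off the last two characters removes only the Windows '\r\n'.
trimmed = line[:-2] if line.endswith("\r\n") else line

print(repr(over_stripped))  # 'MSH|example segment'
print(repr(trimmed))        # 'MSH|example segment\r'
```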
Note that unlike Perl's chomp function, rstrip will strip all specified characters at the end of the string, not just one:
>>> "Hello\n\n\n".rstrip("\n")
'Hello'
If your question is how to remove all the line breaks in a multi-line str object (oldstr), you can split it into a list on the delimiter '\n' and then join that list into a new str (newstr). Note that this splits only on '\n', so any '\r' characters remain in the result.
newstr = "".join(oldstr.split('\n'))
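One caveat worth illustrating: split('\n') splits only on '\n', so a '\r' left over from Windows line endings stays in the result, while splitlines handles both. A minimal sketch with a made-up sample string:

```python
# Made-up sample mixing Windows (\r\n) and Unix (\n) line endings.
oldstr = "first\r\nsecond\nthird\n"

# split("\n") leaves the '\r' from the Windows line ending in place.
joined_split = "".join(oldstr.split("\n"))

# splitlines() recognizes '\r\n' as a single line boundary.
joined_splitlines = "".join(oldstr.splitlines())

print(repr(joined_split))       # 'first\rsecondthird'
print(repr(joined_splitlines))  # 'firstsecondthird'
```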