How can I remove a trailing newline?

感动是毒 2020-11-21 23:27

What is the Python equivalent of Perl's chomp function, which removes the last character of a string if it is a newline?

28 Answers
  • 2020-11-21 23:53
    s = '''Hello  World \t\n\r\tHi There'''
    # import the string module for its whitespace constant
    import string
    # use str.translate to delete every whitespace character
    s.translate({ord(c): None for c in string.whitespace})
    >>'HelloWorldHiThere'
    

    With regex

    import re
    s = '''  Hello  World 
    \t\n\r\tHi '''
    print(re.sub(r"\s+", "", s))  # \s matches any whitespace character
    >HelloWorldHi
    

    Replace \n,\t,\r

    s.replace('\n', '').replace('\t','').replace('\r','')
    >'  Hello  World Hi '
    

    With regex

    s = '''Hello  World \t\n\r\tHi There'''
    regex = re.compile(r'[\n\r\t]')
    regex.sub("", s)
    >'Hello  World Hi There'
    

    With split and join (this also collapses each run of whitespace into a single space)

    s = '''Hello  World \t\n\r\tHi There'''
    ' '.join(s.split())
    >'Hello World Hi There'
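
The snippets above remove whitespace throughout the string. For the question as asked (trailing newline only), a minimal sketch with str.rstrip might look like the following; the variable names are illustrative, not from the original answer.

```python
s = "Hello  World \t\n\r\tHi There\n"

# rstrip("\n") removes every trailing "\n" and nothing else,
# so whitespace inside the string is left alone.
only_newlines = s.rstrip("\n")

# rstrip() with no argument removes all trailing whitespace,
# including spaces and tabs.
all_trailing = (s + " \t").rstrip()

print(repr(only_newlines))
print(repr(all_trailing))
```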
    
  • 2020-11-21 23:54

    If you are concerned about speed (say you have a very long list of strings) and you know each string ends with exactly one newline character, string slicing is faster than rstrip. A little test to illustrate this:

    import time
    
    loops = 50000000
    
    def method1(loops=loops):
        # Slicing: drops the last character unconditionally
        test_string = 'num\n'
        t0 = time.time()
        for num in range(loops):
            out_string = test_string[:-1]
        t1 = time.time()
        print('Method 1: ' + str(t1 - t0))
    
    def method2(loops=loops):
        # rstrip: strips all trailing whitespace
        test_string = 'num\n'
        t0 = time.time()
        for num in range(loops):
            out_string = test_string.rstrip()
        t1 = time.time()
        print('Method 2: ' + str(t1 - t0))
    
    method1()
    method2()
    

    Output:

    Method 1: 3.92700004578
    Method 2: 6.73000001907
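
Slicing with [:-1] removes the last character whatever it is, so it is only safe when the newline is known to be there. As a hedged sketch, the comparison could be rerun with timeit and a guarded slice; the helper names chomp_slice and chomp_rstrip are made up for this example, and the timings will vary by machine and Python version.

```python
import timeit

test_string = "num\n"

def chomp_slice(s):
    # Drop the last character only if it really is a newline.
    return s[:-1] if s.endswith("\n") else s

def chomp_rstrip(s):
    # Removes every trailing "\n", not just one.
    return s.rstrip("\n")

# A small repeat count keeps the sketch quick; raise it for steadier numbers.
slice_time = timeit.timeit(lambda: chomp_slice(test_string), number=100_000)
rstrip_time = timeit.timeit(lambda: chomp_rstrip(test_string), number=100_000)

print(f"guarded slice: {slice_time:.4f} s")
print(f"rstrip:        {rstrip_time:.4f} s")
```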
    
  • 2020-11-21 23:55
    "line 1\nline 2\r\n...".replace('\n', '').replace('\r', '')
    >>> 'line 1line 2...'
    

    or you could always get geekier with regexps :)

    have fun!

  • 2020-11-21 23:55

    I'm bubbling up my regular-expression-based answer from one I posted earlier in the comments of another answer. I think using re is a clearer, more explicit solution to this problem than str.rstrip.

    >>> import re
    

    If you want to remove one or more trailing newline chars:

    >>> re.sub(r'[\n\r]+$', '', '\nx\r\n')
    '\nx'
    

    If you want to remove newline chars everywhere (not just trailing):

    >>> re.sub(r'[\n\r]+', '', '\nx\r\n')
    'x'
    

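    If you want to remove only 1-2 trailing newline chars (i.e., \r, \n, \r\n, \n\r, \r\r, \n\n). Be aware that $ also matches just before a final \n, so re.sub can match twice, which is why the first example loses three characters; anchor with \Z instead of $ if that matters.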

    >>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r\n')
    '\nx\r'
    >>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r')
    '\nx\r'
    >>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n')
    '\nx'
    

    I have a feeling that what most people really want here is to remove just one occurrence of a trailing newline sequence, either \r\n or \n, and nothing more.

    >>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n\n', count=1)
    '\nx\n'
    >>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n\r\n', count=1)
    '\nx\r\n'
    >>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n', count=1)
    '\nx'
    >>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n', count=1)
    '\nx'
    

    (The ?: is to create a non-capturing group.)

    (By the way, this is not what '...'.rstrip('\n').rstrip('\r') does, which may not be clear to others stumbling upon this thread. str.rstrip strips as many of the trailing characters as possible, so a string like foo\n\n\n would be reduced to foo, whereas you may have wanted to preserve the other newlines after stripping a single trailing one.)
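
The same "remove exactly one trailing line ending" behaviour can also be sketched without re, using str.endswith. The function name chomp is borrowed from Perl for illustration; it is not a Python built-in.

```python
def chomp(s):
    # Remove a single trailing line ending (CRLF or LF), if present.
    if s.endswith("\r\n"):
        return s[:-2]
    if s.endswith("\n"):
        return s[:-1]
    return s

print(repr(chomp("\nx\n\n")))
print(repr(chomp("\nx\r\n")))

# For contrast, rstrip removes every trailing newline.
print(repr("foo\n\n\n".rstrip("\n")))
```

Checking for "\r\n" before "\n" matters: testing "\n" first would leave a stray "\r" behind on Windows-style line endings.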

  • 2020-11-21 23:58

    And I would say the "pythonic" way to get lines without trailing newline characters is splitlines().

    >>> text = "line 1\nline 2\r\nline 3\nline 4"
    >>> text.splitlines()
    ['line 1', 'line 2', 'line 3', 'line 4']
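
One thing to keep in mind is that str.splitlines treats more than \n and \r\n as line boundaries (for example the form feed \x0c), and it does not produce a trailing empty string the way split('\n') does. A small sketch of the difference, using a made-up sample string:

```python
text = "line 1\nline 2\r\nline 3\x0cline 4\n"

# splitlines drops the line endings and also splits on the form feed.
lines = text.splitlines()

# keepends=True keeps each line's original terminator attached.
lines_with_ends = text.splitlines(keepends=True)

# split("\n") only knows about "\n": the "\r" stays, and the final
# newline yields an empty last element.
naive = text.split("\n")

print(lines)
print(lines_with_ends)
print(naive)
```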
    
  • 2020-11-21 23:58

    I find it convenient to be able to get the chomped lines via an iterator, parallel to the way you can get the un-chomped lines from a file object. You can do so with the following code:

    import operator

    def chomped_lines(it):
        # Strip trailing '\r' and '\n' characters from each line of the iterable
        return map(operator.methodcaller('rstrip', '\r\n'), it)
    

    Sample usage:

    with open("file.txt") as infile:
        for line in chomped_lines(infile):
            process(line)
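
To try this without a file on disk, any iterable of strings will do. The sketch below repeats the helper so it runs on its own and adds a generator-expression variant (chomped_lines_gen, a name invented here) that should behave the same way; in Python 3 both are lazy.

```python
import operator

def chomped_lines(it):
    # Lazily strip trailing "\r" and "\n" characters from each item.
    return map(operator.methodcaller("rstrip", "\r\n"), it)

def chomped_lines_gen(it):
    # Same idea written as a generator expression.
    return (line.rstrip("\r\n") for line in it)

# A stand-in for the lines a file object would yield.
sample = ["first\r\n", "second\n", "third"]

for line in chomped_lines(sample):
    print(repr(line))
```

Note that rstrip('\r\n') removes every trailing \r and \n, so a line ending in several newlines loses all of them.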
    