Python “split” on empty new line

野的像风 2020-12-16 03:54

Trying to use a Python split on an "empty" newline (a blank line), but not on any other newlines. I tried a few other examples I found, but none of them seem to work.

Data example:

3 Answers
  • 2020-12-16 04:13

    This works in the case where multiple blank lines should be treated as one.

    import re
    
    def split_on_empty_lines(s):
    
        # greedily match 2 or more new-lines
        blank_line_regex = r"(?:\r?\n){2,}"
    
        return re.split(blank_line_regex, s.strip())
    

    The regex is a bit odd.

    1. Firstly, the greedy matching means that a run of blank lines counts as a single match, i.e. 6 blank lines make one split, not three splits.
    2. Secondly, the pattern doesn't just match \n but either \r\n (for Windows) or \n (for Linux/Mac).
    3. Thirdly, the group (denoted by parentheses) needs to have ?: inside the
      opening parenthesis to make it a "non-capturing" group. With a capturing group, re.split would also include the captured newlines in the returned list.

    For example:

    s = """
    
    hello
    world
    
    this is
    
    
    
    
    
    
    
    a test
    
    """
    
    split_on_empty_lines(s)
    

    returns

    ['hello\nworld', 'this is', 'a test']
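
The third point about non-capturing groups is easiest to see side by side. This is only a small sketch with made-up sample text (the names `text`, `non_capturing` and `capturing` are not from the answer), comparing the same pattern with and without `?:`:

```python
import re

# Sample text mixing Unix (\n) and Windows (\r\n) line endings.
text = "a\n\nb\r\n\r\nc"

# Non-capturing group: the separators are dropped from the result.
non_capturing = re.split(r"(?:\r?\n){2,}", text)

# Capturing group: re.split also returns what the group captured,
# which here is the last newline of each separator.
capturing = re.split(r"(\r?\n){2,}", text)

print(non_capturing)  # ['a', 'b', 'c']
print(capturing)      # ['a', '\n', 'b', '\r\n', 'c']
```

So without `?:` the list would be polluted with newline strings between the actual pieces.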
    
  • 2020-12-16 04:24

    A blank line shows up as two newlines in a row. So your easiest solution is probably to split on two consecutive newlines (UNLESS you expect to have a situation where you'll have more than one blank line in a row).

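    import os
    # output is assumed to hold the text to split.
    # As DeepSpace notes, there is no need to initialise myarray first, since split returns a list.
    # os.linesep keeps the separator in line with the platform's line ending.
    # Note: text read via open() in text mode has its line endings translated to "\n", so "\n\n" is the separator there.
    myarray = output.split(os.linesep + os.linesep)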
    

    That would be where I'd start anyway
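
If runs of several blank lines can occur after all, a plain split leaves empty or newline-prefixed pieces behind. Here is a rough sketch of one way to tidy that up without a regex. The function name `split_on_double_newline` and the sample text are made up for illustration, and it normalises Windows line endings first rather than relying on os.linesep:

```python
def split_on_double_newline(text):
    # Normalise Windows line endings so a single separator works everywhere.
    normalized = text.replace("\r\n", "\n")
    pieces = normalized.split("\n\n")
    # Runs of blank lines leave empty or newline-padded pieces; drop or trim them.
    return [p.strip("\n") for p in pieces if p.strip("\n")]


# What a plain split does with two blank lines in a row:
raw = "a\n\n\nb".split("\n\n")
print(raw)  # ['a', '\nb']

sample = "hello\nworld\n\n\n\nthis is\r\n\r\na test\n"
blocks = split_on_double_newline(sample)
print(blocks)  # ['hello\nworld', 'this is', 'a test']
```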

  • 2020-12-16 04:25

    It's quite easy when you consider what is on an empty line. It's just the newline character, so splitting on an empty line means splitting on two newline characters in sequence (one from the previous non-empty line, and one that is the 'whole' empty line).

    # output is assumed to hold the text to split
    myarray = output.split("\n\n")
    for line in myarray:
        print(line)
        print("Next Line")
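
Following the same idea of looking at what is actually on each line, another option is to walk the lines and group the non-blank ones. This is just a sketch, not something from the answers above: `group_non_blank_lines` is a made-up name, and it assumes that lines containing only spaces should count as blank too.

```python
from itertools import groupby


def group_non_blank_lines(text):
    # splitlines() handles both \n and \r\n line endings.
    lines = text.splitlines()
    blocks = []
    # Consecutive lines are grouped by whether they are blank (empty or whitespace only).
    for is_blank, group in groupby(lines, key=lambda line: not line.strip()):
        if not is_blank:
            blocks.append("\n".join(group))
    return blocks


sample = "first\nblock\n\n  \nsecond block\r\n\r\nthird"
result = group_non_blank_lines(sample)
print(result)  # ['first\nblock', 'second block', 'third']
```

Unlike splitting on "\n\n", this also copes with "blank" lines that contain stray spaces.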
    