Splitting a text file into sections with a special delimiter line - Python

说谎 2020-12-19 14:58

I have an input file as such:

This is a text block start
This is the end

And this is another
with more than one line
and another line.

The

3 Answers
  • 2020-12-19 15:15

    Simply do this:

    with open('yourfilename.txt') as f:  # open the desired file
        data = f.read()  # read the whole file into one string
        # split() returns a list of sections; the * unpacks that list so
        # print() shows the sections as text instead of a list
        print(*data.split('=========='))
    

    Output:

    content content
    more content
    content conclusion
    
    content again
    more of it
    content conclusion
    
    content
    content
    contend done
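
If the sections are needed as a list rather than printed, a minimal sketch along the same lines could look like this. The function name `split_sections` and the sample text are assumptions made for illustration; the delimiter is the `==========` string used above.

```python
def split_sections(text, delimiter="=========="):
    """Split text on the delimiter and return the non-empty, stripped sections."""
    parts = (part.strip() for part in text.split(delimiter))
    # Drop empty pieces, e.g. the one after a trailing delimiter.
    return [part for part in parts if part]


# Hypothetical sample input, not the file from the question.
sample = "content\nmore content\n==========\ncontent again\n==========\n"
print(split_sections(sample))
```

Stripping each part removes the newlines left around the delimiter, so every section comes back as clean text.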
    
  • 2020-12-19 15:32

    How about something like this?

    from itertools import groupby
    
    def per_section(s, delimiters=()):
        def key(s):
            return not s or s.isspace() or any(s.startswith(x) for x in delimiters)
        for k, g in groupby(s.splitlines(), key=key):
            if not k:
                yield list(g)
    
    
    if __name__ == '__main__':
        print(list(per_section('''This is a text block start
    This is the end
    
    And this is another
    with more than one line
    and another line.''')))
    
        print(list(per_section('''# Some comments, maybe the title of the following section
    This is a text block start
    This is the end
    # Some other comments and also the title
    And this is another
    with more than one line
    and another line.''', ('#',))))
    
        print(list(per_section('''!! Some comments, maybe the title of the following section
    This is a text block start
    This is the end
    $$ Some other comments and also the title
    And this is another
    with more than one line
    and another line.''', ('!', '$'))))
    

    Output:

    [['This is a text block start', 'This is the end'], ['And this is another', 'with more than one line', 'and another line.']]
    [['This is a text block start', 'This is the end'], ['And this is another', 'with more than one line', 'and another line.']]
    [['This is a text block start', 'This is the end'], ['And this is another', 'with more than one line', 'and another line.']]
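
One detail about the `delimiters` argument is easy to miss: a one-element tuple needs a trailing comma. The sketch below shows why this matters for multi-character delimiters. The helper name `starts_with_any` is made up for illustration and only repeats the prefix check from `key()`.

```python
def starts_with_any(line, delimiters=()):
    # Same prefix check as in key(): does the line start with any delimiter?
    return any(line.startswith(x) for x in delimiters)


# Parentheses alone do not make a tuple: ('==') is just the string '=='.
print(type(('==')).__name__)    # str
print(type(('==',)).__name__)   # tuple

# Iterating the string yields single '=' characters, so a line that
# starts with only one '=' is wrongly treated as a delimiter.
print(starts_with_any('=not a delimiter', ('==')))    # True
print(starts_with_any('=not a delimiter', ('==',)))   # False
```

With a single-character delimiter such as `'#'` both spellings behave the same, which is why the mistake often goes unnoticed.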
    
  • 2020-12-19 15:34

    How about passing a predicate?

    def per_section(it, is_delimiter=lambda x: x.isspace()):
        ret = []
        for line in it:
            if is_delimiter(line):
                if ret:
                    yield ret  # OR  ''.join(ret)
                    ret = []
            else:
                ret.append(line.rstrip())  # OR  ret.append(line)
        if ret:
            yield ret
    

    Usage:

    with open('/path/to/file.txt') as f:
        sections = list(per_section(f))  # default: blank lines are delimiters
    
    with open('/path/to/file.txt') as f:
        # lines starting with '#' (comments) act as delimiters
        sections = list(per_section(f, lambda line: line.startswith('#')))
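
The same predicate idea also covers a dedicated delimiter line such as `==========`. Below is a small self-contained sketch: `per_section` is repeated from the answer above so the snippet runs on its own, and `io.StringIO` with made-up sample text stands in for a real file (both are assumptions for illustration).

```python
import io


def per_section(it, is_delimiter=lambda x: x.isspace()):
    """Yield lists of lines, split wherever is_delimiter(line) is true."""
    ret = []
    for line in it:
        if is_delimiter(line):
            if ret:
                yield ret
                ret = []
        else:
            ret.append(line.rstrip())
    if ret:
        yield ret


# Hypothetical sample input; a real file object works the same way.
sample = io.StringIO("a\nb\n==========\nc\n==========\nd\ne\n")

# A line counts as a delimiter only if it is exactly the marker.
sections = list(per_section(sample, lambda line: line.strip() == "=========="))
print(sections)  # [['a', 'b'], ['c'], ['d', 'e']]
```

Because the file is consumed line by line, this variant does not need to read the whole file into memory first.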
    