Splitting large text file by a delimiter in Python

栀梦 2020-12-03 06:07

I imagine this is going to be a simple task, but I can't find exactly what I am looking for in previous StackOverflow questions, so here goes...

I have large text fil

2 answers
  • 2020-12-03 06:41

    If every entry block starts with a colon, you can just split by that:

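    with open('entries.txt') as fp:
        contents = fp.read()  # reads the whole file into memory

    # If the file starts with ':', the first piece is an empty string.
    for entry in contents.split(':'):
        ...  # do something with entry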
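Since the question is about a large file, reading everything with fp.read() may not be ideal. A minimal sketch of a streaming variant follows, assuming each entry begins on a line that starts with a colon. The helper name iter_entries and the sample text are made up for illustration (the sample is modeled on the output shown in the other answer), and io.StringIO stands in for an open file.

```python
import io


def iter_entries(lines, marker=":"):
    """Yield each entry as a list of lines.

    A line starting with `marker` opens a new entry, so only one
    entry is held in memory at a time.
    """
    entry = []
    for line in lines:
        if line.startswith(marker) and entry:
            yield entry
            entry = []
        entry.append(line)
    if entry:
        # Emit the final entry once the input is exhausted.
        yield entry


# Hypothetical sample data; in practice pass an open file object instead.
sample = io.StringIO(":Entry\n- Name\nJohn Doe\n\n:Entry\n- Name\nJane Doe\n")
entries = list(iter_entries(sample))
for entry in entries:
    print(entry)
```

Unlike contents.split(':'), this only treats a colon at the start of a line as a delimiter, so colons inside an entry (a time such as 06:41, for instance) would not split it.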
    
  • 2020-12-03 06:45

    You could use itertools.groupby to group lines that occur after :Entry into lists:

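    import itertools as it

    filename = 'test.dat'

    with open(filename, 'r') as f:
        # groupby yields runs of consecutive lines sharing the same key:
        # True for ':Entry' marker lines, False for the lines between them.
        for key, group in it.groupby(f, lambda line: line.startswith(':Entry')):
            if not key:
                group = list(group)
                print(group)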
    

    yields

    ['- Name\n', 'John Doe\n', '\n', '- Date\n', '20/12/1979\n']
    ['\n', '-Name\n', 'Jane Doe\n', '- Date\n', '21/12/1979\n']
    

    Or, to process the groups, you don't really need to convert group to a list:

    with open(filename, 'r') as f:
        for key, group in it.groupby(f, lambda line: line.startswith(':Entry')):
            if not key:
                for line in group:
                    ...
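To try the groupby approach without a file on disk, here is a small self-contained sketch. The sample text is an assumption reconstructed from the output shown above, io.StringIO stands in for the open file, and the names is_marker and records are illustrative only.

```python
import io
import itertools as it

# Hypothetical input reconstructed from the output shown above.
sample = io.StringIO(
    ":Entry\n- Name\nJohn Doe\n\n- Date\n20/12/1979\n"
    ":Entry\n\n-Name\nJane Doe\n- Date\n21/12/1979\n"
)


def is_marker(line):
    """Return True for the delimiter lines that separate entries."""
    return line.startswith(":Entry")


records = []
for key, group in it.groupby(sample, is_marker):
    if not key:
        # Strip newlines and drop blank lines within each entry.
        records.append([line.strip() for line in group if line.strip()])

for record in records:
    print(record)
```

One design point to keep in mind: groupby only groups consecutive lines, so two ':Entry' lines in a row would fall into a single marker group and produce no empty record between them.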
    