Counting the jump (number of lines) between the first two 'String' occurrences in a file

天命终不由人 2021-01-22 09:27

I have a huge data file with a specific string being repeated after a defined number of lines.

I want to count the jump (number of lines) between the first two 'Rank' occurrences. For example the file

4 answers
  •  野趣味
    野趣味 (original poster)
    2021-01-22 10:07

    Don't use .readlines() when a simple generator expression counting the lines without Rank is enough:

    with open(filename) as f:
        count = sum(1 for line in f if 'Rank' not in line)


    The expression 'Rank' not in line is all you need to test that the string 'Rank' does not occur in a line. Iterating over the open file yields its lines one at a time, so the whole file is never held in memory. The sum() function adds up the 1s generated for each line that does not contain Rank, giving you the count of lines without Rank in them.
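To see the counting expression at work without a real file, here is a minimal sketch that runs it over an in-memory stand-in. The sample contents and the name `count_without_rank` are made up for illustration; `io.StringIO` is used only because it can be iterated line by line like an open file.

```python
import io

# Hypothetical sample data standing in for the real file.
sample = io.StringIO("header\nRank 1\na\nb\nRank 2\nc\n")

# Same idea as the generator expression above: emit a 1 for every
# line that does not contain 'Rank', then add them up.
count_without_rank = sum(1 for line in sample if 'Rank' not in line)
print(count_without_rank)
```

The lines "header", "a", "b" and "c" are the ones without Rank, so the sum counts those four.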

    If you need to count the lines from Rank to Rank, you need a little itertools.takewhile magic:

    import itertools
    with open(filename) as f:
        # skip until we reach the first `Rank` line; takewhile is lazy,
        # so it has to be consumed before the file actually advances:
        for _ in itertools.takewhile(lambda l: 'Rank' not in l, f):
            pass
        # takewhile will have read the first line with `Rank` now
        # count the lines *without* `Rank` up to the second `Rank` line
        count = sum(1 for l in itertools.takewhile(lambda l: 'Rank' not in l, f))
        count += 1  # add one so the count is the jump from `Rank` to `Rank`
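If the file might contain fewer than two Rank lines, the takewhile approach silently counts to the end of the file. As an alternative sketch, not the method from the answer above, the jump can be computed from line indices with `enumerate`, returning `None` when there is no second occurrence. The function name `jump_between_first_two`, its `marker` parameter, and the sample data are hypothetical.

```python
import io

def jump_between_first_two(lines, marker="Rank"):
    """Return how many lines separate the first two lines containing
    `marker`, or None if fewer than two such lines exist."""
    first = None
    for lineno, line in enumerate(lines):
        if marker in line:
            if first is None:
                first = lineno          # remember the first occurrence
            else:
                return lineno - first   # stop at the second occurrence
    return None

# Hypothetical sample data standing in for the real file.
sample = io.StringIO("header\nRank 1\na\nb\nRank 2\nc\nRank 3\n")
jump = jump_between_first_two(sample)
print(jump)
```

Because the function returns as soon as it sees the second match, a huge file is only read up to that point. With a real file it could be called as `jump_between_first_two(f)` inside a `with open(filename) as f:` block.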
    
