Remove item from list based on the next item in same list

悲&欢浪女 2021-02-18 17:08

I just started learning Python, and I have a sorted list of protein sequences (59,000 in total), some of which overlap. I have made a toy list here as an example:

11 answers
  •  遇见更好的自我
    2021-02-18 17:57

    As stated in other answers, your error comes from calculating the length of your input at the start and then not updating it as you shorten the list.
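The asker's code is not reproduced in this thread, so the following is only a hypothetical reconstruction of that kind of bug, not the original. It assumes the loop bound was taken from `len()` once and then went stale as items were deleted. The names `drop_prefixes_buggy`, `drop_prefixes_fixed`, and the sample sequences are made up for illustration.

```python
def drop_prefixes_buggy(seqs):
    """Hypothetical reconstruction: the bound is fixed before the loop."""
    seqs = list(seqs)
    n = len(seqs)              # computed once, never updated
    for i in range(n - 1):
        if seqs[i + 1].startswith(seqs[i]):
            del seqs[i]        # the list shrinks, but range(n - 1) does not
    return seqs


def drop_prefixes_fixed(seqs):
    """Same idea, but the bound is re-checked on every pass."""
    seqs = list(seqs)
    i = 0
    while i < len(seqs) - 1:   # len() is re-evaluated each iteration
        if seqs[i + 1].startswith(seqs[i]):
            del seqs[i]        # do not advance: a new item now sits at i
        else:
            i += 1
    return seqs


toy = ["ABC", "ABCD", "ABCDE", "XY", "XYZ"]  # made-up sorted sample
try:
    drop_prefixes_buggy(toy)
except IndexError as exc:
    print("buggy version failed:", exc)
print(drop_prefixes_fixed(toy))
```

With this sample the stale bound eventually indexes past the end of the shortened list, while the `while` loop keeps only the longest sequence of each overlapping run.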

    Here's another take at a working solution:

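    with open('toy.txt', 'r') as infile:
        # Strip newlines and walk the sorted list from last to first.
        # reversed() needs a sequence; in Python 3, map() returns an iterator.
        input_lines = [line.strip() for line in reversed(infile.readlines())]

    output = []
    for pattern in input_lines:
        # Keep a sequence only if the last kept one does not start with it.
        if not output or not output[-1].startswith(pattern):
            output.append(pattern)

    print('\n'.join(reversed(output)))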
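To try the idea without a `toy.txt` file, here is a minimal sketch of an equivalent forward pass, assuming the input is already sorted as the question states. It compares each item with the one right after it and drops the item when the next one starts with it. The function name `remove_prefixes` and the sample data are illustrative, not taken from the question.

```python
def remove_prefixes(seqs):
    """Drop every item that is a prefix of the item right after it.

    Assumes `seqs` is sorted, so a prefix always sits directly
    before the longer sequences that start with it.
    """
    seqs = list(seqs)
    # Pair each item with its successor; the last item has no successor
    # and is always kept (seqs[-1:] is empty for an empty input).
    kept = [cur for cur, nxt in zip(seqs, seqs[1:]) if not nxt.startswith(cur)]
    return kept + seqs[-1:]


sample = ["ABC", "ABCD", "ABCDE", "XY", "XYZ"]  # made-up sorted sample
print("\n".join(remove_prefixes(sample)))
```

Because this builds a new list instead of deleting from the one being iterated, there is no length to keep in sync.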
    
