Remove item from list based on the next item in same list

悲&欢浪女 2021-02-18 17:08

I just started learning Python. I have a sorted list of protein sequences (59,000 in total), and some of them overlap. I have made a toy list here as an example:

11 answers

忘掉有多难 (original poster)

2021-02-18 17:45

This will get you where you want to be:

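with open('toy.txt', 'r') as f:
    # Strip trailing newlines so startswith() compares only the sequences
    lines = [line.strip() for line in f if line.strip()]
data = set(lines)  # drop exact duplicates
# Keep a sequence only if nothing but itself starts with it
print(sorted(i for i in data if len([j for j in data if j.startswith(i)]) == 1))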

#['ABCDEFGHIJKLMNO', 'CEST', 'DBTSFDEO', 'EAEUDNBNUW', 'EOEUDNBNUW', 'FGH']

I've added the set in case the same sequence occurs more than once.
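The one-liner above compares every sequence against every other one, which may get slow with 59,000 sequences. Since the question says the list is sorted, a cheaper variant is possible: in a sorted list of unique strings, if anything starts with a given item, the item immediately after it does too, so checking only the next item is enough. Here is a minimal sketch of that idea; the function name `drop_prefixes` and the sample data are made up for illustration and are not from the question.

```python
def drop_prefixes(sequences):
    """Keep only sequences that are not a prefix of another sequence."""
    # Deduplicate and sort: if any string extends `s`, the one right after `s` does.
    unique = sorted(set(sequences))
    kept = []
    for current, following in zip(unique, unique[1:]):
        # The current item is redundant when the next item starts with it.
        if not following.startswith(current):
            kept.append(current)
    if unique:
        kept.append(unique[-1])  # the last item has no successor, so it always stays
    return kept


# Hypothetical toy data, not the list from the question.
toy = ["ABC", "ABCDEF", "ABCDEFGH", "CEST", "FG", "FGH", "FGH"]
print(drop_prefixes(toy))  # ['ABCDEFGH', 'CEST', 'FGH']
```

This only handles the case where one sequence is a prefix of another, same as the `startswith` check above. Sorting dominates the cost, so it should scale much better than the nested comprehension.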
