I have a sequence of strings: 0000001, 0000002, 0000003, ...
up to 2 million. They are not contiguous, meaning there are gaps. Say after 0000003 the next str
You could sort the list of ids and then step through it once only:
def find_gaps(ids):
    """Generate the gaps in the list of ids."""
    j = 1
    for id_i in sorted(ids):
        while True:
            id_j = '%07d' % j
            j += 1
            if id_j >= id_i:
                break
            yield id_j
>>> list(find_gaps(["0000001", "0000003", "0000006"]))
['0000002', '0000004', '0000005']
If the input list is already in order, then you can drop the call to sorted
(though it does little harm: Python's sort is an adaptive merge sort (Timsort) that runs in O(n) when the list is already sorted).
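If the ids are known to arrive in ascending order, a variant along these lines skips the sort entirely. This is only a sketch: the name find_gaps_presorted and the width parameter are my own and not from the question. It assumes the ids are zero-padded decimal strings numbered from 1, and it compares them as integers, so a repeated id is simply ignored.

```python
def find_gaps_presorted(sorted_ids, width=7):
    """Generate the missing ids, assuming sorted_ids is already ascending.

    Hypothetical helper: 'width' is the assumed zero-padded length of each id.
    """
    expected = 1  # next id number we hope to see
    for id_i in sorted_ids:
        current = int(id_i)
        # Every number between 'expected' and 'current' is a gap.
        while expected < current:
            yield str(expected).zfill(width)
            expected += 1
        # Move past 'current'; max() keeps 'expected' from going
        # backwards when an id is repeated.
        expected = max(expected, current + 1)


print(list(find_gaps_presorted(["0000001", "0000003", "0000006"])))
# ['0000002', '0000004', '0000005']
```

Because it is a generator and never builds a sorted copy, it also works on any iterable that yields the ids in order, such as the lines of a file.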