Efficient way to find missing elements in an integer sequence

鱼传尺愫 2020-12-01 04:32

Suppose we have two items missing in a sequence of consecutive integers and the missing elements lie between the first and last elements. I did write a code that does accomp

16 answers
  •  有刺的猬
    2020-12-01 05:32

    If the input sequence is sorted, you could use sets here. Take the start and end values from the input list:

    def missing_elements(L):
        start, end = L[0], L[-1]
        return sorted(set(range(start, end + 1)).difference(L))
    

    This assumes Python 3; for Python 2, use xrange() to avoid building a list first.

    The sorted() call is optional: without it the function returns the missing values as a set; with it you get a sorted list.

    Demo:

    >>> L = [10,11,13,14,15,16,17,18,20]
    >>> missing_elements(L)
    [12, 19]
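
    If the input is not guaranteed to be sorted, the same set idea can be sketched with min() and max() in place of the first and last items. The name missing_elements_unsorted is hypothetical and not part of the original answer:

```python
def missing_elements_unsorted(L):
    # Assumption: L is a non-empty collection of integers in any order.
    start, end = min(L), max(L)
    # Everything in the full range that does not occur in L is missing.
    return sorted(set(range(start, end + 1)).difference(L))

print(missing_elements_unsorted([20, 10, 13, 11, 14, 18, 15, 17, 16]))  # [12, 19]
```

    This costs two extra passes over the input for min() and max(), but it does not require sorting it.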
    

    Another approach is to detect gaps between consecutive numbers, using the sliding window recipe from older versions of the itertools documentation:

    from itertools import islice, chain
    
    def window(seq, n=2):
        """Return a sliding window (of width n) over data from the iterable.
        s -> (s0, s1, ..., s[n-1]), (s1, s2, ..., sn), ...
        """
        it = iter(seq)
        result = tuple(islice(it, n))
        if len(result) == n:
            yield result
        for elem in it:
            result = result[1:] + (elem,)
            yield result
    
    def missing_elements(L):
        missing = chain.from_iterable(range(x + 1, y) for x, y in window(L) if (y - x) > 1)
        return list(missing)
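
    When the input is a plain list, the same gap detection can be sketched without the window helper by pairing each item with its successor via zip(). The name missing_elements_zip is hypothetical, and unlike the generator-based version this one copies the list once through the slice L[1:]:

```python
def missing_elements_zip(L):
    # Assumption: L is a sorted list of integers (it must support slicing).
    missing = []
    for x, y in zip(L, L[1:]):
        # range(x + 1, y) is empty when y == x + 1, so only real gaps add items.
        missing.extend(range(x + 1, y))
    return missing

print(missing_elements_zip([10, 11, 13, 14, 15, 16, 17, 18, 20]))  # [12, 19]
```

    On Python 3.10 and later, itertools.pairwise() yields the same neighbouring pairs lazily and works on any iterable.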
    

    The gap-detection approach runs in O(n) time, and if you know the number of missing items, you can make sure it produces only those and then stops:

    def missing_elements(L, count):
        missing = chain.from_iterable(range(x + 1, y) for x, y in window(L) if (y - x) > 1)
        return list(islice(missing, 0, count))
    

    This will handle larger gaps too; if you are missing 2 items at 11 and 12, it'll still work:

    >>> missing_elements([10, 13, 14, 15], 2)
    [11, 12]
    

    and the above sample only had to iterate over [10, 13] to figure this out.
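
    One way to check the early-stop behaviour is to wrap the input in a generator that logs every item it hands out. The sketch below repeats window() and the counted missing_elements() so that it runs on its own; the recording() helper is hypothetical and exists only for this demonstration:

```python
from itertools import chain, islice

def window(seq, n=2):
    """Sliding window of width n over the iterable."""
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result

def missing_elements(L, count):
    missing = chain.from_iterable(
        range(x + 1, y) for x, y in window(L) if (y - x) > 1
    )
    # islice stops pulling from the chain once `count` items were produced.
    return list(islice(missing, 0, count))

def recording(iterable, seen):
    """Hypothetical helper: pass items through, appending each one to `seen`."""
    for item in iterable:
        seen.append(item)
        yield item

seen = []
result = missing_elements(recording([10, 13, 14, 15], seen), 2)
print(result)  # [11, 12]
print(seen)    # [10, 13]
```

    Only the first two input items are consumed, because both missing values already come from the first gap.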
