Efficient way to find missing elements in an integer sequence

鱼传尺愫 2020-12-01 04:32

Suppose we have two items missing in a sequence of consecutive integers and the missing elements lie between the first and last elements. I did write a code that does accomp

16 Answers
  • 2020-12-01 05:30
    >>> l = [10,11,13,14,15,16,17,18,20]
    >>> [a + 1 for a, b in zip(l, l[1:]) if b - a > 1]
    [12, 19]
    
  • 2020-12-01 05:31

    Using collections.Counter:

    from collections import Counter
    
    data = [10, 11, 13, 14, 15, 16, 17, 18, 20]
    counts = Counter(data)
    # A count of 0 means the value never appears in the input.
    print([i for i in range(data[0], data[-1] + 1) if counts[i] == 0])
    

    Output:

    [12, 19]
    
  • 2020-12-01 05:31

    Here's a one-liner:

    In [10]: l = [10,11,13,14,15,16,17,18,20]
    
    In [11]: [i for i, (n1, n2) in enumerate(zip(l[:-1], l[1:])) if n1 + 1 != n2]
    Out[11]: [1, 7]
    

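    I zip the list against a copy of itself offset by one (via slicing) and use enumerate to get the index of the element just before each gap. Note that the result is a list of indices, not the missing values themselves; l[i] + 1 is the first missing value after index i.

    For long lists this isn't great, because it is O(n) rather than O(log(n)), but for small inputs it should be pretty efficient compared with building a set. On Python 2, izip from itertools would probably make it quicker still; on Python 3, zip is already lazy.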
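
    If sub-linear behaviour matters, one option (not part of the answer above) is divide and conquer over index ranges. In a sorted, duplicate-free list, the slice from index lo to hi has no gaps exactly when L[hi] - L[lo] == hi - lo, so gap-free halves can be skipped without scanning them. A minimal sketch under those assumptions; the name missing_by_bisection is made up for illustration:

```python
def missing_by_bisection(L, lo=0, hi=None):
    """Return the values absent between L[lo] and L[hi].

    Assumes L is sorted in ascending order and contains no duplicates.
    """
    if hi is None:
        hi = len(L) - 1
    # Empty or single-element range, or value span equals index span: no gap.
    if hi <= lo or L[hi] - L[lo] == hi - lo:
        return []
    # Two adjacent elements with a gap between them: emit the whole gap.
    if hi - lo == 1:
        return list(range(L[lo] + 1, L[hi]))
    # Otherwise split in the middle and only descend into halves with gaps.
    mid = (lo + hi) // 2
    return missing_by_bisection(L, lo, mid) + missing_by_bisection(L, mid, hi)


print(missing_by_bisection([10, 11, 13, 14, 15, 16, 17, 18, 20]))  # [12, 19]
```

    With only a few gaps this should inspect roughly O(k log(n)) elements for k gaps; with many gaps it degrades towards a full scan.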

  • 2020-12-01 05:32

    If the input sequence is sorted, you could use sets here. Take the start and end values from the input list:

    def missing_elements(L):
        start, end = L[0], L[-1]
        return sorted(set(range(start, end + 1)).difference(L))
    

    This assumes Python 3; for Python 2, use xrange() to avoid building a list first.

    The sorted() call is optional: without it you get a set of the missing values, with it a sorted list.

    Demo:

    >>> L = [10,11,13,14,15,16,17,18,20]
    >>> missing_elements(L)
    [12, 19]
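
    The function above reads the bounds from L[0] and L[-1], which is why it needs sorted input. If the input may be unsorted, a variant can take the bounds from min() and max() instead. This is a sketch, not part of the original answer, and the name missing_elements_unsorted is hypothetical:

```python
def missing_elements_unsorted(values):
    """Return the sorted missing integers between min(values) and max(values)."""
    if not values:
        return []  # nothing to compare against
    present = set(values)
    # Iterating the range in order yields an already sorted result.
    return [n for n in range(min(present), max(present) + 1) if n not in present]


print(missing_elements_unsorted([20, 13, 10, 11, 18, 14, 17, 15, 16]))  # [12, 19]
```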
    

    Another approach is to detect gaps between consecutive numbers, using the sliding window recipe from older itertools documentation:

    from itertools import islice, chain
    
    def window(seq, n=2):
        """Return a sliding window (of width n) over data from the iterable.

        s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...
        """
        it = iter(seq)
        result = tuple(islice(it, n))
        if len(result) == n:
            yield result
        for elem in it:
            result = result[1:] + (elem,)
            yield result
    
    def missing_elements(L):
        missing = chain.from_iterable(range(x + 1, y) for x, y in window(L) if (y - x) > 1)
        return list(missing)
    

    This is a pure O(n) operation, and if you know the number of missing items, you can make sure it only produces those and then stops:

    def missing_elements(L, count):
        missing = chain.from_iterable(range(x + 1, y) for x, y in window(L) if (y - x) > 1)
        return list(islice(missing, 0, count))
    

    This will handle larger gaps too; if you are missing 2 items at 11 and 12, it'll still work:

    >>> missing_elements([10, 13, 14, 15], 2)
    [11, 12]
    

    and the above sample only had to iterate over [10, 13] to figure this out.
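
    The early stop can be checked by wrapping the input in a generator that records what gets consumed. The sketch below is self-contained, so it replaces window() with a small pairs() helper; pairs, first_missing and traced are names invented for this illustration:

```python
from itertools import chain, islice


def pairs(iterable):
    """Yield (previous, current) for each adjacent pair, pulling items lazily."""
    it = iter(iterable)
    try:
        prev = next(it)
    except StopIteration:
        return
    for cur in it:
        yield prev, cur
        prev = cur


def first_missing(values, count):
    """Return the first `count` missing values, stopping once they are found."""
    gaps = (range(x + 1, y) for x, y in pairs(values) if y - x > 1)
    return list(islice(chain.from_iterable(gaps), count))


def traced(values, seen):
    """Hypothetical helper: record every item the consumer actually pulls."""
    for item in values:
        seen.append(item)
        yield item


seen = []
result = first_missing(traced([10, 13, 14, 15], seen), 2)
print(result)  # [11, 12]
print(seen)    # [10, 13]
```

    On Python 3.10 and later, itertools.pairwise could stand in for the pairs() helper.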
