Pythonic way to determine whether not null list entries are 'continuous'

渐次进展 2021-02-01 01:35

I'm looking for a way to easily determine whether all non-None items in a list occur in a single continuous slice. I'll use integers as examples of non-None items.

11 answers
  • 2021-02-01 01:54

    Here's a solution inspired by numpy. Get the array indices of all the non-null elements. Then, compare each index to the one following it. If the difference is greater than one, there are nulls in between the non-nulls. If there are no indices where the following index is more than one greater, then there are no gaps.

    def is_continuous(seq):
        non_null_indices = [i for i, obj in enumerate(seq) if obj is not None]
        for i, index in enumerate(non_null_indices[:-1]):
            if non_null_indices[i+1] - index > 1:
                return False
        return True
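
    The index comparison described above can also be written without an explicit loop. This is only a sketch of the same idea; the name is_continuous_pairs is made up for illustration, and it keeps the behaviour of returning True for an empty or all-None list.

```python
def is_continuous_pairs(seq):
    """Return True if no None sits between two non-None items of seq."""
    non_null_indices = [i for i, obj in enumerate(seq) if obj is not None]
    # Pair each index with its successor; any step other than 1 means a None in between.
    return all(b - a == 1 for a, b in zip(non_null_indices, non_null_indices[1:]))
```

    Because the indices are already sorted, zipping the list with its own tail visits every adjacent pair exactly once.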
    
  • 2021-02-01 01:58

    This may not be the best way to go about doing it, but you can look for the first non-None entry and the last non-None entry and then check the slice for None. e.g.:

    def is_continuous(seq):
        try:
            first_pos = next(i for i, x in enumerate(seq) if x is not None)
            # Negative offset of the last non-None item. When that item is the final
            # element the offset is -0 == 0, so `or None` makes the slice run to the end.
            last_pos = -next(i for i, x in enumerate(reversed(seq)) if x is not None) or None
        except StopIteration:  # sequence consists entirely of None
            return False
        return None not in seq[first_pos:last_pos]
    
    assert is_continuous([1,2,3,None,None]) == True
    assert is_continuous([None, 1,2,3,None]) == True
    assert is_continuous([None, None, 1,2,3]) == True
    assert is_continuous([None, 1, None, 2,3]) == False
    assert is_continuous([None, None, 1, None, 2,3]) == False
    assert is_continuous([None, 1, None, 2, None, 3]) == False
    assert is_continuous([1, 2, None, 3, None, None]) == False
    

    This will work for any sequence type.

  • 2021-02-01 02:01

    You could use something like itertools.groupby:

    from itertools import groupby
    
    def are_continuous(items):
        saw_group = False
    
        for group, values in groupby(items, lambda i: i is not None):
            if group:
                if saw_group:
                    return False
                else:
                    saw_group = True
    
        return True
    

    This stops iterating as soon as it sees a second group of non-None items. I'm not sure how you want to treat a list such as [None, None] (this version returns True for it), so tweak it to your needs.
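
    If an all-None list should not count as continuous, one possible tweak is to return the flag instead of True. This is a sketch, not part of the answer's original code, and the name are_continuous_strict is made up for illustration.

```python
from itertools import groupby

def are_continuous_strict(items):
    """Like are_continuous, but require exactly one run of non-None items."""
    saw_group = False
    for is_value, _ in groupby(items, lambda i: i is not None):
        if is_value:
            if saw_group:
                return False  # a second run of non-None items means there was a gap
            saw_group = True
    return saw_group  # False when no non-None item was seen at all
```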

  • 2021-02-01 02:03

    One liner:

    contiguous = lambda l: ' ' not in ''.join('x '[x is None] for x in l).strip()
    

    The real work is done by the strip function. If there are spaces in a stripped string, then they're not leading/trailing. The rest of the function converts the list to a string, which has a space for each None.
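
    For readers who find the indexing trick 'x '[x is None] hard to parse, here is the same idea spelled out step by step. The function name contiguous_verbose is made up for this sketch.

```python
def contiguous_verbose(items):
    # Map each item to one character: ' ' for None, 'x' for anything else.
    marks = ''.join(' ' if x is None else 'x' for x in items)
    # strip() drops leading and trailing spaces, so any space left is an interior None.
    return ' ' not in marks.strip()
```

    Like the one-liner, this returns True for an empty or all-None list, because the stripped string is then empty.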

  • 2021-02-01 02:07

    I did some profiling to compare @gnibbler's approach with the groupby approach. @gnibbler's approach is consistently faster, especially for longer lists. For example, I see about a 50% performance gain for random inputs of length 3-100, where each input has a 50% chance of containing a single int sequence (randomly selected) and otherwise contains random values. Test code is below. I interspersed the two methods (randomly selecting which one goes first) to make sure any caching effects cancel out. Based on this, I'd say that while the groupby approach is more intuitive, @gnibbler's approach may be appropriate if profiling indicates that this is an important part of the overall code to optimize. In that case, appropriate comments should be used to explain how all/any are being used to consume iterator values.

    from itertools import groupby
    import random, time
    
    def contiguous1(seq):
        # gnibbler's approach
        seq = iter(seq)
        all(x is None for x in seq)        # Burn through any Nones at the beginning
        any(x is None for x in seq)        # and the first group
        return all(x is None for x in seq) # everything else (if any) should be None.
    
    def contiguous2(seq):
        return sum(1 for k,g in groupby(seq, lambda x: x is not None) if k) == 1
    
    times = {'contiguous1':0,'contiguous2':0}
    
    for i in range(400000):
        n = random.randint(3,100)
        items = [None] * n
        if random.randint(0,1):
            s = random.randint(0,n-1)
            e = random.randint(0,n-s)
            for j in range(s,e):  # j avoids shadowing the outer loop variable; empty when e <= s
                items[j] = 3
        else:
            for i in range(n):
                if not random.randint(0,2):
                    items[i] = 3
        if random.randint(0,1):
            funcs = [contiguous1, contiguous2]
        else:
            funcs = [contiguous2, contiguous1]
        for func in funcs:
            t0 = time.time()
            func(items)
            times[func.__name__] += (time.time()-t0)
    
    print()
    for f, t in times.items():
        print('%10.7f %s' % (t, f))
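
    As a complement to the hand-rolled timing loop, the standard timeit module can time both functions on one fixed input. This is a minimal sketch, not the benchmark used for the numbers above; the helper name compare and its number parameter are made up for illustration.

```python
import timeit
from itertools import groupby

def contiguous1(seq):
    seq = iter(seq)
    all(x is None for x in seq)         # skip leading Nones
    any(x is None for x in seq)         # skip the first run of values
    return all(x is None for x in seq)  # the rest must all be None

def contiguous2(seq):
    return sum(1 for k, g in groupby(seq, lambda x: x is not None) if k) == 1

def compare(items, number=1000):
    """Return {function name: seconds for `number` calls} on one fixed input."""
    return {
        f.__name__: timeit.timeit(lambda: f(items), number=number)
        for f in (contiguous1, contiguous2)
    }
```

    Calling compare([None, 1, 2, None]) returns a dict with one timing per function. Note that the two functions disagree on an empty or all-None list: contiguous1 returns True there, while contiguous2 returns False because it requires exactly one group.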
    
  • 2021-02-01 02:10

    This algorithm does the work with a few drawbacks (it removes items from the list). But it's a solution.

    Basically, you remove all consecutive None values from the start and the end. If any None is still left in the list, the integers are not in a continuous run.

    def is_continuous(seq):
        while seq and seq[0] is None: del seq[0]
        while seq and seq[-1] is None: del seq[-1]
    
        return None not in seq
    
    assert is_continuous([1,2,3,None,None]) == True
    assert is_continuous([None, 1,2,3,None]) == True
    assert is_continuous([None, None, 1,2,3]) == True
    assert is_continuous([None, 1, None, 2,3]) == False
    assert is_continuous([None, None, 1, None, 2,3]) == False
    assert is_continuous([None, 1, None, 2, None, 3]) == False
    assert is_continuous([1, 2, None, 3, None, None]) == False
    

    Yet another example of how small code could become evil.

    I wish a strip() method were available for list.
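
    The mutation drawback can be avoided by moving two indices inward instead of deleting items. This is a sketch of that variation, assuming the input supports len() and slicing; the name is_continuous_no_mutation is made up for illustration.

```python
def is_continuous_no_mutation(seq):
    """Same check as above, but leaves the caller's list untouched."""
    start, end = 0, len(seq)
    # Skip leading Nones.
    while start < end and seq[start] is None:
        start += 1
    # Skip trailing Nones.
    while end > start and seq[end - 1] is None:
        end -= 1
    # Whatever remains between the two ends must contain no None.
    return all(x is not None for x in seq[start:end])
```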
