NumPy: find indices of mask edges

清酒与你 2020-12-20 17:48

I'm trying to find the indices of masked segments. For example:

mask = [1, 0, 0, 1, 1, 1, 0, 0]
segments = [(0, 0), (3, 5)]

Current solution l

2 Answers
  • 2020-12-20 17:48

    An alternative approach with np.ma.clump_masked.

    import numpy as np

    mask = np.array([1, 0, 0, 1, 1, 1, 0, 0])
    # mask the nonzero entries, then get a list of "clumps" (contiguous slices).
    slices = np.ma.clump_masked(np.ma.masked_where(mask, mask))
    # convert each half-open slice to a tuple of inclusive (start, end) indices.
    result = [(s.start, s.stop - 1) for s in slices]
    # [(0, 0), (3, 5)]
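
    As a rough sketch of how the same masked array can also give the runs of 0s: np.ma.clump_unmasked is the counterpart of clump_masked. The variable names below (marr, ones_segments, zeros_segments) are just illustrative, and the int() calls are only there so the tuples hold plain Python ints.

```python
import numpy as np

mask = np.array([1, 0, 0, 1, 1, 1, 0, 0])

# masked_where(condition, a) masks entries of `a` where the condition is
# truthy, so every nonzero element of `mask` becomes a masked entry.
marr = np.ma.masked_where(mask, mask)

# Runs of masked entries (the 1s) and of unmasked entries (the 0s),
# each returned as a list of half-open slices.
ones_slices = np.ma.clump_masked(marr)
zeros_slices = np.ma.clump_unmasked(marr)

# Convert half-open slices to inclusive (start, end) tuples.
ones_segments = [(int(s.start), int(s.stop) - 1) for s in ones_slices]
zeros_segments = [(int(s.start), int(s.stop) - 1) for s in zeros_slices]

print(ones_segments)   # [(0, 0), (3, 5)]
print(zeros_segments)  # [(1, 2), (6, 7)]
```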
    
  • 2020-12-20 17:52

    Here's one approach -

    import numpy as np

    def start_stop(a, trigger_val):
        # "Enclose" mask with sentinels to catch shifts later on
        mask = np.r_[False, np.equal(a, trigger_val), False]

        # Get the shifting indices
        idx = np.flatnonzero(mask[1:] != mask[:-1])

        # Get the start and end indices with slicing along the shifting ones.
        # list() is needed on Python 3, where zip returns an iterator;
        # tolist() yields plain Python ints.
        return list(zip(idx[::2].tolist(), (idx[1::2] - 1).tolist()))
    

    Sample run -

    In [216]: mask = [1, 0, 0, 1, 1, 1, 0, 0]
    
    In [217]: start_stop(mask, trigger_val=1)
    Out[217]: [(0, 0), (3, 5)]
    

    Use it to get the edges for 0s -

    In [218]: start_stop(mask, trigger_val=0)
    Out[218]: [(1, 2), (6, 7)]
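
    To see why the sentinel padding works, here is a small walk-through of the intermediate arrays for the sample mask. It is only a sketch of the same idea; the names padded, edges, starts and stops are made up for illustration.

```python
import numpy as np

a = [1, 0, 0, 1, 1, 1, 0, 0]

# Pad the boolean mask with False on both ends so that a run touching
# either boundary still produces a rising and a falling edge.
padded = np.r_[False, np.equal(a, 1), False]

# True wherever two neighbouring entries differ, i.e. at every edge.
edges = padded[1:] != padded[:-1]

# Edge positions alternate: start, one-past-end, start, one-past-end, ...
idx = np.flatnonzero(edges)
print(idx.tolist())  # [0, 1, 3, 6]

starts = idx[::2].tolist()
stops = (idx[1::2] - 1).tolist()  # exclusive ends -> inclusive ends
segments = list(zip(starts, stops))
print(segments)  # [(0, 0), (3, 5)]
```

    Because the padding guarantees the mask starts and ends with False, the number of edges is always even, so the starts and stops pair up cleanly.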
    

    Timings on the same data scaled up 100000x -

    In [226]: mask = [1, 0, 0, 1, 1, 1, 0, 0]
    
    In [227]: mask = np.repeat(mask,100000)
    
    # Original soln
    In [230]: %%timeit
         ...: segments = []
         ...: start = 0
         ...: for i in range(len(mask) - 1):
         ...:     e1 = mask[i]
         ...:     e2 = mask[i + 1]
         ...:     if e1 == 0 and e2 == 1:
         ...:         start = i + 1
         ...:     elif e1 == 1 and e2 == 0:
         ...:         segments.append((start, i))
    1 loop, best of 3: 401 ms per loop
    
    # @Yakym Pirozhenko's soln (np.ma.clump_masked)
    In [231]: %%timeit
         ...: slices = np.ma.clump_masked(np.ma.masked_where(mask, mask))
         ...: result = [(s.start, s.stop - 1) for s in slices]
    100 loops, best of 3: 4.8 ms per loop
    
    In [232]: %timeit start_stop(mask, trigger_val=1)
    1000 loops, best of 3: 1.41 ms per loop
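
    If you want to re-run the comparison outside IPython, something along these lines should work with the standard timeit module. This is a sketch, not the exact benchmark above: clump_segments is a hypothetical wrapper name for the np.ma.clump_masked answer, and the numbers will depend on your machine and NumPy version.

```python
import timeit

import numpy as np


def start_stop(a, trigger_val):
    # Sentinel-padded edge detection, as in the answer above.
    mask = np.r_[False, np.equal(a, trigger_val), False]
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return list(zip(idx[::2].tolist(), (idx[1::2] - 1).tolist()))


def clump_segments(a):
    # Hypothetical wrapper around the np.ma.clump_masked approach.
    slices = np.ma.clump_masked(np.ma.masked_where(a, a))
    return [(int(s.start), int(s.stop) - 1) for s in slices]


# Same scaling as in the timings above: each element repeated 100000 times.
big = np.repeat([1, 0, 0, 1, 1, 1, 0, 0], 100000)

# Check that both approaches agree before timing them.
assert start_stop(big, 1) == clump_segments(big)

candidates = [
    ("start_stop", lambda: start_stop(big, 1)),
    ("clump_masked", lambda: clump_segments(big)),
]
for name, fn in candidates:
    # Best of 3 repeats, 10 calls each, reported per call.
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{name}: {best * 1e3:.3f} ms per call")
```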
    