python: vectorized cumulative counting

被撕碎了的回忆 2021-01-16 02:44

I have a NumPy array and would like to count the number of occurrences of each value, but cumulatively: each position should hold how many times its value has already appeared earlier in the array.

in  = [0, 1, 0, 1, 2, 3, 0, 0, 2, 1, 1, 3,         


        
1 Answer
  • 2021-01-16 03:30

    Here's one vectorized approach using sorting -

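    import numpy as np

    def cumcount(a):
        a = np.asarray(a)

        # Store length of array; an empty input has nothing to count
        n = len(a)
        if n == 0:
            return np.empty(0, dtype=int)
    
        # Get sorted indices (use later on too) and store the sorted array.
        # A stable sort keeps equal values in their original order, so the
        # counts within each group follow the order of appearance.
        sidx = a.argsort(kind='stable')
        b = a[sidx]
    
        # Mask of shifts/groups
        m = b[1:] != b[:-1]
    
        # Get indices of those shifts
        idx = np.flatnonzero(m)
    
        # ID array that will store the cumulative nature at the very end:
        # steps of +1, with a negative step at each group start to reset to 0
        id_arr = np.ones(n, dtype=int)
        id_arr[0] = 0
        if idx.size:  # skip when the array holds a single distinct value
            id_arr[idx[1:]+1] = -np.diff(idx)+1
            id_arr[idx[0]+1] = -idx[0]
        c = id_arr.cumsum()
    
        # Finally re-arrange those cumulative values back to original order
        out = np.empty(n, dtype=int)
        out[sidx] = c
        return out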
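
To see how the ID array resets the running count at each group boundary, here is a small trace on an already-sorted array. The input `b` and the name `steps` are made up for illustration; the indexing mirrors the function above.

```python
import numpy as np

# Hypothetical sorted input with three groups: 0,0,0 | 1,1 | 2
b = np.array([0, 0, 0, 1, 1, 2])

# Positions i where b[i+1] differs from b[i], i.e. a new group starts at i+1
idx = np.flatnonzero(b[1:] != b[:-1])       # -> [2, 4]

# Start with steps of +1, then overwrite each group start with a negative
# step that cancels whatever the previous group accumulated
steps = np.ones(len(b), dtype=int)
steps[0] = 0
steps[idx[0] + 1] = -idx[0]                 # first boundary
steps[idx[1:] + 1] = -np.diff(idx) + 1      # remaining boundaries

print(steps.tolist())            # [0, 1, 1, -2, 1, -1]
print(steps.cumsum().tolist())   # [0, 1, 2, 0, 1, 0]
```

The cumulative sum therefore counts 0, 1, 2, ... inside each group and drops back to 0 wherever the value changes.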
    

    Sample run -

    In [66]: a
    Out[66]: array([0, 1, 0, 1, 2, 3, 0, 0, 2, 1, 1, 3, 3, 0])
    
    In [67]: cumcount(a)
    Out[67]: array([0, 0, 1, 1, 0, 0, 2, 3, 1, 2, 3, 1, 2, 4])
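
As a sanity check, the sample output can be compared against a plain Python loop. This is only a reference sketch, not a vectorized solution; `cumcount_loop` is a hypothetical helper name and not part of the answer above.

```python
import numpy as np
from collections import defaultdict

def cumcount_loop(a):
    """Reference version: one pass with a per-value counter (not vectorized)."""
    seen = defaultdict(int)
    out = np.empty(len(a), dtype=int)
    for i, v in enumerate(np.asarray(a).tolist()):
        out[i] = seen[v]   # how many times v appeared before position i
        seen[v] += 1
    return out

a = np.array([0, 1, 0, 1, 2, 3, 0, 0, 2, 1, 1, 3, 3, 0])
expected = np.array([0, 0, 1, 1, 0, 0, 2, 3, 1, 2, 3, 1, 2, 4])

# The loop reproduces the output shown in the sample run
assert np.array_equal(cumcount_loop(a), expected)
```

Running the same comparison on larger random arrays is a cheap way to confirm that a vectorized version keeps counts in order of appearance.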
    