How to find run-length encoding in Python

梦毁少年i 2021-01-18 06:42

I have an array ar = [2,2,2,1,1,2,2,3,3,3,3]. For this array, I want to find the length of each run of consecutive equal numbers, like this:

 values: 2, 1, 2, 3
 lengths: 3, 2, 2, 4

3 Answers
  • 2021-01-18 06:47

    Here is an answer using the high-performance pyrle library for run length arithmetic:

    # pip install pyrle
    # (pyrle >= 0.0.25)
    
    from pyrle import Rle
    
    v = [2,2,2,1,1,2,2,3,3,3,3]
    
    r = Rle(v)
    print(r)
    # +--------+-----+-----+-----+-----+
    # | Runs   | 3   | 2   | 2   | 4   |
    # |--------+-----+-----+-----+-----|
    # | Values | 2   | 1   | 2   | 3   |
    # +--------+-----+-----+-----+-----+
    # Rle of length 11 containing 4 elements
    
    print(r[4])
    # 1.0
    
    print(r[4:7])
    # +--------+-----+-----+
    # | Runs   | 1   | 2   |
    # |--------+-----+-----|
    # | Values | 1.0 | 2.0 |
    # +--------+-----+-----+
    # Rle of length 3 containing 2 elements
    
    print(r + r + 0.5)
    # +--------+-----+-----+-----+-----+
    # | Runs   | 3   | 2   | 2   | 4   |
    # |--------+-----+-----+-----+-----|
    # | Values | 4.5 | 2.5 | 4.5 | 6.5 |
    # +--------+-----+-----+-----+-----+
    # Rle of length 11 containing 4 elements
    
  • 2021-01-18 07:04

    Here is an answer using pure NumPy:

    import numpy as np
    
    
    def find_runs(x):
        """Find runs of consecutive items in an array."""
    
        # ensure array
        x = np.asanyarray(x)
        if x.ndim != 1:
            raise ValueError('only 1D array supported')
        n = x.shape[0]
    
        # handle empty array
        if n == 0:
            return np.array([]), np.array([]), np.array([])
    
        else:
            # find run starts
            loc_run_start = np.empty(n, dtype=bool)
            loc_run_start[0] = True
            np.not_equal(x[:-1], x[1:], out=loc_run_start[1:])
            run_starts = np.nonzero(loc_run_start)[0]
    
            # find run values
            run_values = x[loc_run_start]
    
            # find run lengths
            run_lengths = np.diff(np.append(run_starts, n))
    
            return run_values, run_starts, run_lengths
    

    Credit goes to https://github.com/alimanfoo
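
To show what this approach yields for the array in the question, here is a minimal, self-contained sketch of the same idea. The name `run_length_encode` is hypothetical (it is not part of the answer above or of NumPy), and the compact formulation with `np.flatnonzero` is one possible variant rather than the original author's code.

```python
import numpy as np


def run_length_encode(x):
    """Return (values, starts, lengths) for runs of equal items in a 1-D array."""
    x = np.asarray(x)
    if x.ndim != 1:
        raise ValueError("only 1D arrays are supported")
    n = x.shape[0]
    if n == 0:
        empty = np.array([], dtype=int)
        return x, empty, empty
    # A run starts at index 0 and wherever an element differs from its predecessor.
    starts = np.concatenate(([0], np.flatnonzero(x[1:] != x[:-1]) + 1))
    # Each run ends where the next one begins; the last run ends at n.
    lengths = np.diff(np.append(starts, n))
    return x[starts], starts, lengths


ar = [2, 2, 2, 1, 1, 2, 2, 3, 3, 3, 3]
values, starts, lengths = run_length_encode(ar)
print(values.tolist())   # [2, 1, 2, 3]
print(starts.tolist())   # [0, 3, 5, 7]
print(lengths.tolist())  # [3, 2, 2, 4]
```

The start indices come for free with this method, which can be handy if you also need to know where each run sits in the original array.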

  • 2021-01-18 07:07

    You can do this with itertools.groupby:

    In [60]: from itertools import groupby
    In [61]: ar = [2,2,2,1,1,2,2,3,3,3,3]
    In [62]: print([(k, sum(1 for i in g)) for k, g in groupby(ar)])
    [(2, 3), (1, 2), (2, 2), (3, 4)]
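
If you want the values and lengths as two separate lists, as in the question, the same groupby idea can be wrapped in a small helper. This is only a sketch, and `run_lengths` is a hypothetical name chosen for illustration.

```python
from itertools import groupby


def run_lengths(seq):
    """Return (values, lengths) for runs of consecutive equal items in seq."""
    values, lengths = [], []
    for key, group in groupby(seq):
        values.append(key)
        # groupby yields a lazy iterator per run, so count its items.
        lengths.append(sum(1 for _ in group))
    return values, lengths


ar = [2, 2, 2, 1, 1, 2, 2, 3, 3, 3, 3]
values, lengths = run_lengths(ar)
print(values)   # [2, 1, 2, 3]
print(lengths)  # [3, 2, 2, 4]
```

This needs only the standard library and works on any iterable, not just lists of numbers.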
    