I have an array ar = [2,2,2,1,1,2,2,3,3,3,3].
For this array, I want to find the lengths of the runs of consecutive equal numbers, like:
values: 2, 1, 2, 3
lengths: 3, 2, 2, 4
Here is an answer using the high-performance pyrle library for run-length arithmetic:
# pip install pyrle
# (pyrle >= 0.0.25)
from pyrle import Rle
v = [2,2,2,1,1,2,2,3,3,3,3]
r = Rle(v)
print(r)
# +--------+-----+-----+-----+-----+
# | Runs | 3 | 2 | 2 | 4 |
# |--------+-----+-----+-----+-----|
# | Values | 2 | 1 | 2 | 3 |
# +--------+-----+-----+-----+-----+
# Rle of length 11 containing 4 elements
print(r[4])
# 1.0
print(r[4:7])
# +--------+-----+-----+
# | Runs | 1 | 2 |
# |--------+-----+-----|
# | Values | 1.0 | 2.0 |
# +--------+-----+-----+
# Rle of length 3 containing 2 elements
r + r + 0.5
# +--------+-----+-----+-----+-----+
# | Runs | 3 | 2 | 2 | 4 |
# |--------+-----+-----+-----+-----|
# | Values | 4.5 | 2.5 | 4.5 | 6.5 |
# +--------+-----+-----+-----+-----+
# Rle of length 11 containing 4 elements
Here is an answer using pure NumPy:
import numpy as np
def find_runs(x):
    """Find runs of consecutive items in an array."""
    # ensure array
    x = np.asanyarray(x)
    if x.ndim != 1:
        raise ValueError('only 1D array supported')
    n = x.shape[0]
    # handle empty array
    if n == 0:
        return np.array([]), np.array([]), np.array([])
    else:
        # find run starts
        loc_run_start = np.empty(n, dtype=bool)
        loc_run_start[0] = True
        np.not_equal(x[:-1], x[1:], out=loc_run_start[1:])
        run_starts = np.nonzero(loc_run_start)[0]
        # find run values
        run_values = x[loc_run_start]
        # find run lengths
        run_lengths = np.diff(np.append(run_starts, n))
        return run_values, run_starts, run_lengths
Credit goes to https://github.com/alimanfoo
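As a quick check of the NumPy approach on the array from the question, here is a minimal sketch. The name `run_length_encode` is hypothetical, and the function is a condensed variant of the one above rather than the original author's code. It also uses `np.repeat` to confirm that the values and lengths reproduce the input.

```python
import numpy as np

def run_length_encode(x):
    """Return (values, starts, lengths) for runs of equal items in a 1D array."""
    x = np.asarray(x)
    if x.ndim != 1:
        raise ValueError("only 1D arrays are supported")
    if x.size == 0:
        empty = np.array([], dtype=int)
        return x, empty, empty
    # A run starts at index 0 and wherever an item differs from its predecessor.
    starts = np.flatnonzero(np.concatenate(([True], x[1:] != x[:-1])))
    # Each run ends where the next one starts; the last run ends at len(x).
    lengths = np.diff(np.append(starts, x.size))
    return x[starts], starts, lengths

ar = [2, 2, 2, 1, 1, 2, 2, 3, 3, 3, 3]
values, starts, lengths = run_length_encode(ar)
print(values.tolist())   # [2, 1, 2, 3]
print(starts.tolist())   # [0, 3, 5, 7]
print(lengths.tolist())  # [3, 2, 2, 4]

# Decoding: repeating each value by its run length rebuilds the input.
decoded = np.repeat(values, lengths)
```

The starts are often as useful as the lengths, since `ar[starts[i]:starts[i] + lengths[i]]` slices out the i-th run directly.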
You can do this with itertools.groupby:
In [60]: from itertools import groupby
In [61]: ar = [2,2,2,1,1,2,2,3,3,3,3]
In [62]: print([(k, sum(1 for i in g)) for k, g in groupby(ar)])
[(2, 3), (1, 2), (2, 2), (3, 4)]
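If the values and lengths are wanted as two separate lists, as in the question, the pairs from groupby can be split afterwards. This is a minimal sketch using only the standard library; the variable names `runs`, `values`, and `lengths` are chosen here for illustration.

```python
from itertools import groupby

ar = [2, 2, 2, 1, 1, 2, 2, 3, 3, 3, 3]

# groupby yields (key, group_iterator) for each run of equal adjacent items.
runs = [(key, sum(1 for _ in group)) for key, group in groupby(ar)]

# Split the (value, length) pairs into two separate lists.
values = [value for value, _ in runs]
lengths = [length for _, length in runs]

print(values)   # [2, 1, 2, 3]
print(lengths)  # [3, 2, 2, 4]
```

Note that groupby only groups adjacent items, which is exactly what run-length counting needs, so the input must not be sorted first.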