Question

The aim is to find groups of increasing (monotonic) numbers in a list of integers. Each item in a resulting group must be exactly +1 from the previous item.

Given an input:

x = [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]

I need to find groups of increasing numbers and achieve:

increasing_numbers = [(7,8,9,10), (0,1,2,3,4,5)]

And eventually also the number of increasing numbers:

len(list(chain(*increasing_numbers)))

And also the len of the groups:

increasing_num_groups_length = [len(i) for i in increasing_numbers]

I have tried the following to get the number of increasing numbers:

>>> from itertools import tee, chain
>>> def pairwise(iterable): 
...     a, b = tee(iterable)
...     next(b, None)
...     return zip(a, b)
... 
>>> x = [8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6]
>>> set(list(chain(*[(i,j) for i,j in pairwise(x) if j-1==i])))
set([1, 2, 3, 4, 5, 6, 8, 9, 10, 11])
>>> len(set(list(chain(*[(i,j) for i,j in pairwise(x) if j-1==i]))))
10

But I'm unable to keep the order and the groups of increasing numbers.

How can I achieve the increasing_numbers groups of integer tuples and also the increasing_num_groups_length?

Also, is there a name for this or similar problems?

EDITED

I've come up with this solution, but it's super verbose and I'm sure there's an easier way to get the increasing_numbers output:

>>> from itertools import tee, chain
>>> def pairwise(iterable): 
...     a, b = tee(iterable)
...     next(b, None)
...     return zip(a, b)
... 
>>> x = [8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6]
>>> boundary =  iter([0] + [i+1 for i, (j,k) in enumerate(pairwise(x)) if j+1!=k] + [len(x)])
>>> [tuple(x[i:next(boundary)]) for i in boundary]
[(8, 9, 10, 11), (1, 2, 3, 4, 5, 6)]

Is there a more pythonic / less verbose way to do this?

Another input/output example:

[in]:

[17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35, 36, 40]

[out]:

[(19, 20, 21, 22), (0, 1, 2), (4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), (28, 29, 30, 31, 32, 33, 34, 35, 36)]

Answer 1:

A couple of different ways using itertools and numpy:

from itertools import groupby, tee, cycle

x = [17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35,
     36, 1, 2, 3, 4,34,54]


def sequences(l):
    x2 = cycle(l)
    next(x2)
    grps = groupby(l, key=lambda j: j + 1 == next(x2))
    for k, v in grps:
        if k:
            yield tuple(v) + (next((next(grps)[1])),)


print(list(sequences(x)))

[(19, 20, 21, 22), (0, 1, 2), (4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), (28, 29, 30, 31, 32, 33, 34, 35, 36), (1, 2, 3, 4)]

Or using python3 and yield from:

def sequences(l):
    x2 = cycle(l)
    next(x2)
    grps = groupby(l, key=lambda j: j + 1 == next(x2))
    yield from (tuple(v) + (next((next(grps)[1])),) for k,v in grps if k)

print(list(sequences(x)))

Using a variation of another answer of mine, with numpy.split:

import numpy as np

out = [tuple(arr) for arr in np.split(x, np.where(np.diff(x) != 1)[0] + 1) if arr.size > 1]

print(out)

[(19, 20, 21, 22), (0, 1, 2), (4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), (28, 29, 30, 31, 32, 33, 34, 35, 36), (1, 2, 3, 4)]

And similar to ekhumoro's answer:

def sequences(x):
    it = iter(x)
    prev, temp = next(it), []
    while prev is not None:
        start = next(it, None)
        if prev + 1 == start:
            temp.append(prev)
        elif temp:
            yield tuple(temp + [prev])
            temp = []
        prev = start

To get the length and the tuple:

def sequences(l):
    x2 = cycle(l)
    next(x2)
    grps = groupby(l, key=lambda j: j + 1 == next(x2))
    for k, v in grps:
        if k:
            t = tuple(v) + (next(next(grps)[1]),)
            yield t, len(t)


def sequences(l):
    x2 = cycle(l)
    next(x2)
    grps = groupby(l, lambda j: j + 1 == next(x2))
    yield from ((t, len(t)) for t in (tuple(v) + (next(next(grps)[1]),)
                                      for k, v in grps if k))



def sequences(x):
    it = iter(x)
    prev, temp = next(it), []
    while prev is not None:
        start = next(it, None)
        if prev + 1 == start:
            temp.append(prev)
        elif temp:
            yield tuple(temp + [prev]), len(temp) + 1
            temp = []
        prev = start

Output will be the same for all three:

[((19, 20, 21, 22), 4), ((0, 1, 2), 3), ((4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14), 11)
, ((28, 29, 30, 31, 32, 33, 34, 35, 36), 9), ((1, 2, 3, 4), 4)]
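
For comparison, here is a minimal sketch of a different, commonly used itertools idiom; it is not one of the functions above. It relies on the fact that inside a +1 run the difference between a value and its index stays constant, so groupby can key on that difference. The name consecutive_runs is hypothetical.

```python
from itertools import groupby


def consecutive_runs(values):
    """Yield tuples of adjacent items that each increase by exactly 1."""
    # Inside a +1 run, value - index is constant, so it works as a group key.
    keyed = groupby(enumerate(values), key=lambda pair: pair[1] - pair[0])
    for _, group in keyed:
        run = tuple(value for _, value in group)
        if len(run) > 1:  # drop isolated items
            yield run


x = [17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12,
     13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35, 36, 40]
runs = list(consecutive_runs(x))
print(runs)
print([len(run) for run in runs])
```

Two adjacent items share a key only when the second is exactly one more than the first, so no extra look-ahead iterator is needed.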

Answer 2:

EDIT:

Here's a code-golf solution (142 characters):

def f(x):s=[0]+[i for i in range(1,len(x)) if x[i]!=x[i-1]+1]+[len(x)];return [x[j:k] for j,k in [s[i:i+2] for i in range(len(s)-1)] if k-j>1]

Expanded version:

def igroups(x):
    s = [0] + [i for i in range(1, len(x)) if x[i] != x[i-1] + 1] + [len(x)]
    return [x[j:k] for j, k in [s[i:i+2] for i in range(len(s)-1)] if k - j > 1]

Commented version:

def igroups(x):
    # find the boundaries where numbers are not consecutive
    boundaries = [i for i in range(1, len(x)) if x[i] != x[i-1] + 1]
    # add the start and end boundaries
    boundaries = [0] + boundaries + [len(x)]
    # take the boundaries as pairwise slices
    slices = [boundaries[i:i + 2] for i in range(len(boundaries) - 1)]
    # extract all sequences with length greater than one
    return [x[start:end] for start, end in slices if end - start > 1]
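
To make the boundary idea concrete, here is a small trace on the question's first example. It follows the same steps as the commented version, except that the pairwise slices are built with zip, which is my own substitution; the variable names are illustrative.

```python
x = [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]

# Indices where the +1 chain breaks.
breaks = [i for i in range(1, len(x)) if x[i] != x[i - 1] + 1]
# Add the start and end of the list as outer boundaries.
boundaries = [0] + breaks + [len(x)]
# Pair each boundary with the next one to get (start, end) slices.
slices = list(zip(boundaries, boundaries[1:]))
# Keep only slices that contain more than one item.
groups = [x[start:end] for start, end in slices if end - start > 1]

print(breaks)      # [4, 5]
print(boundaries)  # [0, 4, 5, 11]
print(slices)      # [(0, 4), (4, 5), (5, 11)]
print(groups)      # [[7, 8, 9, 10], [0, 1, 2, 3, 4, 5]]
```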

Original solution:

Not sure whether this counts as "pythonic" or "not too verbose":

def igroups(iterable):
    items = iter(iterable)
    a, b = None, next(items, None)
    result = [b]
    while b is not None:
        a, b = b, next(items, None)
        if b is not None and a + 1 == b:
            result.append(b)
        else:
            if len(result) > 1:
                yield tuple(result)
            result = [b]

print(list(igroups([])))
print(list(igroups([0, 0, 0])))
print(list(igroups([7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5])))
print(list(igroups([8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6])))
print(list(igroups([9, 1, 2, 3, 1, 1, 2, 3, 5])))

Output:

[]
[]
[(7, 8, 9, 10), (0, 1, 2, 3, 4, 5)]
[(8, 9, 10, 11), (1, 2, 3, 4, 5, 6)]
[(1, 2, 3), (1, 2, 3)]

Answer 3:

I think the most maintainable solution would be to make it simple:

def group_by(l):
    res = [[l[0]]]
    for i in range(1, len(l)):
        if l[i-1] < l[i]:
            res[-1].append(l[i])
        else:
            res.append([l[i]])
    return res

This solution does not filter out single-element sequences, but that is easy to add. Note that it compares neighbours with <, so it groups any strictly increasing run, not only +1 steps; use l[i-1] + 1 == l[i] for the stricter rule. Additionally, this has O(n) complexity, and you can make it a generator as well if you want.

By maintainable I mean code that is not a 300-character one-liner full of convoluted expressions. If that is what you want, maybe you should use Perl :). At least you will know how the function behaves one year later.

>>> x = [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]
>>> print(group_by(x))
[[7, 8, 9, 10], [6], [0, 1, 2, 3, 4, 5]]
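
As a sketch of the two tweaks mentioned above (dropping single-element groups and turning the function into a generator), something like the following could work. It also checks for an exact +1 step, as the question asks. The name consecutive_groups is hypothetical.

```python
def consecutive_groups(items):
    """Yield lists of adjacent items that each increase by exactly 1."""
    group = []
    for item in items:
        if group and item == group[-1] + 1:
            group.append(item)  # item continues the current run
        else:
            if len(group) > 1:  # emit the finished run, skip singletons
                yield group
            group = [item]      # start a new run
    if len(group) > 1:          # do not forget the last run
        yield group


x = [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]
print(list(consecutive_groups(x)))  # [[7, 8, 9, 10], [0, 1, 2, 3, 4, 5]]
```

Because it only iterates once and never indexes, this version also accepts an empty list or any other iterable.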

Answer 4:

If two consecutive numbers increase by one, I add the pair to a list (group) of tuples.

When a pair is not increasing and the group is non-empty, I unzip the group and flatten it to rebuild the sequence that the pairwise zip split up. A set comprehension removes the duplicated numbers, and sorting restores their order.

def extract_increasing_groups(seq):
    seq = tuple(seq)

    def is_increasing(a,b):
        return a + 1 == b

    def unzip(seq):
        return tuple(sorted({ y for x in zip(*seq) for y in x}))

    group = []
    for a,b in zip(seq[:-1],seq[1:]):
        if is_increasing(a,b):
            group.append((a,b))
        elif group:
            yield unzip(group)
            group = []

    if group:
        yield unzip(group)

if __name__ == '__main__':

    x = [17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12,
         13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35, 36, 40]

    for group in extract_increasing_groups(x):
        print(group)

A simpler one using a set:

from collections import namedtuple
from itertools import islice, tee

def extract_increasing_groups(iterable):

    iter1, iter2 = tee(iterable)
    iter2 = islice(iter2,1,None)

    is_increasing = lambda a,b: a + 1 == b
    Igroup = namedtuple('Igroup','group, len')

    group = set()
    for pair in zip(iter1, iter2):
        if is_increasing(*pair):
            group.update(pair)
        elif group:
            yield Igroup(tuple(sorted(group)),len(group))
            group = set()

    if group:
        yield Igroup(tuple(sorted(group)), len(group))


if __name__ == '__main__':

    x = [17, 17, 19, 20, 21, 22, 0, 1, 2, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 14, 14, 28, 29, 30, 31, 32, 33, 34, 35, 36, 40]
    total = 0
    for group in extract_increasing_groups(x):
        total += group.len
        print('Group: {}\nLength: {}'.format(group.group, group.len))
    print('Total: {}'.format(total))

Answer 5:

def igroups(L):
    R=[[]]
    [R[-1].append(L[i]) for i in range(len(L)) if (L[i-1]+1==L[i] if L[i-1]+1==L[i] else R.append([L[i]]))]
    return [P for P in R if len(P)>1]


tests=[[],
    [0, 0, 0],
    [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5],
    [8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6],
    [9, 1, 2, 3, 1, 1, 2, 3, 5],
    [4,3,2,1,1,2,3,3,4,3],
    [1, 4, 3],
    [1],
    [1,2],
    [2,1]
    ]
for L in tests:
    print(L)
    print(igroups(L))
    print("-"*10)

outputting the following:

[]
[]
----------
[0, 0, 0]
[]
----------
[7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]
[[7, 8, 9, 10], [0, 1, 2, 3, 4, 5]]
----------
[8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6]
[[8, 9, 10, 11], [1, 2, 3, 4, 5, 6]]
----------
[9, 1, 2, 3, 1, 1, 2, 3, 5]
[[1, 2, 3], [1, 2, 3]]
----------
[4, 3, 2, 1, 1, 2, 3, 3, 4, 3]
[[1, 2, 3], [3, 4]]
----------
[1, 4, 3]
[]
----------
[1]
[]
----------
[1, 2]
[[1, 2]]
----------
[2, 1]
[]
----------

EDIT: My first attempt using itertools.groupby was a failure, sorry for that.

Answer 6:

With itertools.groupby, partitioning a list of integers L into sublists of adjacent items from L that increase by 1 can be done with a one-liner. Still, I don't know how pythonic that can be considered ;)

Here is the code with some simple tests:

[EDIT : now subsequences are increasing by 1, I missed this point the first time.]

from itertools import groupby

def igroups(L):
    # key function: True when L[i] is exactly one more than the item before it
    def f(i):
        return L[i-1]+1==L[i]

    return [[L[I[0]-1]]+[L[i] for i in I] for I in [I for (v,I) in [(k,[i for i in list(g)]) for (k, g) in groupby(range(1, len(L)), f)] if v]]

outputting:

tests=[
    [0, 0, 0, 0],
    [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5],
    [8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6],
    [9, 1, 2, 3, 1, 1, 2, 3, 5],
    [4,3,2,1,1,2,3,3,4,3],
    [1, 4, 3],
    [1],
    [1,2, 2],
    [2,1],
    [0, 0, 0, 0, 2, 5, 5, 8],
    ]
for L in tests:
    print(L)
    print(igroups(L))
    print('-'*10)


[0, 0, 0, 0]
[]
----------
[7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]
[[7, 8, 9, 10], [0, 1, 2, 3, 4, 5]]
----------
[8, 9, 10, 11, 7, 1, 2, 3, 4, 5, 6]
[[8, 9, 10, 11], [1, 2, 3, 4, 5, 6]]
----------
[9, 1, 2, 3, 1, 1, 2, 3, 5]
[[1, 2, 3], [1, 2, 3]]
----------
[4, 3, 2, 1, 1, 2, 3, 3, 4, 3]
[[1, 2, 3], [3, 4]]
----------
[1, 4, 3]
[]
----------
[1]
[]
----------
[1, 2, 2]
[[1, 2]]
----------
[2, 1]
[]
----------
[0, 0, 0, 0, 2, 5, 5, 8]
[]
----------

Some explanation. If you "unroll" the code, the logic is more apparent:

from itertools import groupby

def igroups(L):
    # key function: True when L[i] is exactly one more than the item before it
    def f(i):
        return L[i]==L[i-1]+1

    monotonic_states = [(k,list(g)) for (k, g) in groupby(range(1, len(L)), f)]
    increasing_items_indices = [I for (v,I) in monotonic_states if v]
    print("\nincreasing_items_indices ->", increasing_items_indices, '\n')
    full_increasing_items= [[L[I[0]-1]]+[L[i] for i in I] for I in increasing_items_indices]
    return full_increasing_items

L= [2, 8, 4, 5, 6, 7, 8, 5, 9, 10, 11, 12, 25, 26, 27, 42, 41]
print(L)
print(igroups(L))

outputting:

[2, 8, 4, 5, 6, 7, 8, 5, 9, 10, 11, 12, 25, 26, 27, 42, 41]

increasing_items_indices -> [[3, 4, 5, 6], [9, 10, 11], [13, 14]]

[[4, 5, 6, 7, 8], [9, 10, 11, 12], [25, 26, 27]]

We need a key function f that compares an item with the preceding one in the given list. The important point is that groupby with the key function f yields tuples (k, S), where S is a run of adjacent indices of the initial list over which f is constant, and k is that constant value: if k is True, S holds the indices of items that increase by 1; otherwise it holds the indices of non-increasing items. (In fact, as the example above shows, each list S is incomplete: it lacks the index of the first item of the run, which is why igroups prepends L[I[0]-1].)

I also ran some random tests on lists of one million items: igroups always returns the correct result but is 4 times slower than a naive implementation! Simpler is easier and faster ;)
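
The random tests themselves are not shown, so here is a rough sketch of how such a cross-check might look. Both igroups_groupby and igroups_naive are hypothetical names, the naive loop is my own guess at what "naive" means, and the list sizes are kept small so the check runs quickly. It verifies agreement only and measures no timings.

```python
import random
from itertools import groupby


def igroups_groupby(L):
    # Same groupby idea, with the key function closed over L.
    def steps_up(i):
        return L[i] == L[i - 1] + 1
    states = [(k, list(g)) for k, g in groupby(range(1, len(L)), steps_up)]
    return [[L[I[0] - 1]] + [L[i] for i in I] for k, I in states if k]


def igroups_naive(L):
    # Plain loop: extend the current run or start a new one.
    result, group = [], []
    for item in L:
        if group and item == group[-1] + 1:
            group.append(item)
        else:
            if len(group) > 1:
                result.append(group)
            group = [item]
    if len(group) > 1:
        result.append(group)
    return result


random.seed(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    data = [random.randint(0, 5) for _ in range(random.randint(0, 30))]
    assert igroups_groupby(data) == igroups_naive(data), data
print("all random checks passed")
```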

Thanks alvas for your question, it was a lot of fun!

Answer 7:

A (really) simple implementation:

x = [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]
result = []
current = x[0]
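temp = [x[0]]  # start the first group with the first item
for i in range(1, len(x)):
    if x[i] - current == 1:
        temp.append(x[i])
    else:
        if len(temp) > 1:
            result.append(temp)
        temp = [x[i]]
    current = x[i]
if len(temp) > 1:  # keep the last group only if it is a real run
    result.append(temp)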

This gives [[7, 8, 9, 10], [0, 1, 2, 3, 4, 5]]. From there, you can get the length of each group with [len(g) for g in result] and the total number of increasing numbers with sum(len(g) for g in result).
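
As a quick illustration of those two expressions, assuming result already holds the groups listed above:

```python
result = [[7, 8, 9, 10], [0, 1, 2, 3, 4, 5]]

# Length of each group of increasing numbers.
group_lengths = [len(group) for group in result]
# Total count of numbers that belong to some group.
total = sum(group_lengths)

print(group_lengths)  # [4, 6]
print(total)          # 10
```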

Answer 8:

I think this works. It's not fancy but it's simple. It constructs a start list sl and an end list el, which should always be the same length, then uses them to index into x:

def igroups(x):
    sl = [i for i in range(len(x)-1)
          if (i == 0 or x[i] != x[i-1]+1) and x[i+1] == x[i]+1]

    el = [i for i in range(1, len(x))
          if x[i] == x[i-1]+1 and (i == len(x)-1 or x[i+1] != x[i]+1)]

    return [x[sl[i]:el[i]+1] for i in range(len(sl))]
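
To see what the start and end lists contain, here is a small trace on the question's first example. It assumes the first-index guard is written as i == 0; the names starts, ends and last are illustrative stand-ins for sl, el and len(x)-1.

```python
x = [7, 8, 9, 10, 6, 0, 1, 2, 3, 4, 5]
last = len(x) - 1

# A group starts where x[i] does not continue a run but x[i+1] follows it.
starts = [i for i in range(last)
          if (i == 0 or x[i] != x[i - 1] + 1) and x[i + 1] == x[i] + 1]
# A group ends where x[i] continues a run but nothing follows x[i].
ends = [i for i in range(1, len(x))
        if x[i] == x[i - 1] + 1 and (i == last or x[i + 1] != x[i] + 1)]
groups = [x[s:e + 1] for s, e in zip(starts, ends)]

print(starts)  # [0, 5]
print(ends)    # [3, 10]
print(groups)  # [[7, 8, 9, 10], [0, 1, 2, 3, 4, 5]]
```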

Source: https://stackoverflow.com/questions/33402355/finding-groups-of-increasing-numbers-in-a-list

Tags

python