How do I use itertools.groupby()?

失恋的感觉 2020-11-22 02:14

I haven't been able to find an understandable explanation of how to actually use Python's itertools.groupby() function. What I'm trying to do is this:

<
13 Answers
  •  星月不相逢
    2020-11-22 02:37

    itertools.groupby is a tool for grouping items.

    From the docs, we glean further what it might do:

    # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B

    # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D

    groupby objects yield key-group pairs, where each group is itself a lazy iterator that draws from the same underlying iterable as groupby.
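Because each group is produced lazily from the shared input, it has to be consumed before groupby moves on to the next key. The following is a minimal sketch of that behavior; the variable names are illustrative only.

```python
from itertools import groupby

data = "AAABBC"

# Materialize each group while it is still the current one.
materialized = [(key, list(group)) for key, group in groupby(data)]
print(materialized)
# [('A', ['A', 'A', 'A']), ('B', ['B', 'B']), ('C', ['C'])]

# Storing the group iterators and reading them later does not work:
# advancing groupby to the next key invalidates the earlier groups.
deferred = [(key, group) for key, group in groupby(data)]
first_key, first_group = deferred[0]
first_items = list(first_group)
print(first_key, first_items)
# A []
```

If the groups are needed later, converting each one with list(g) inside the loop, as above, is the usual approach.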

    Features

    • A. Group consecutive items together
    • B. Group all occurrences of an item, given a sorted iterable
    • C. Specify how to group items with a key function *

    Comparisons

    # Define a printer for comparing outputs
    >>> import itertools as it
    >>> def print_groupby(iterable, keyfunc=None):
    ...     for k, g in it.groupby(iterable, keyfunc):
    ...         print("key: '{}'--> group: {}".format(k, list(g)))
    
    # Feature A: group consecutive occurrences
    >>> print_groupby("BCAACACAADBBB")
    key: 'B'--> group: ['B']
    key: 'C'--> group: ['C']
    key: 'A'--> group: ['A', 'A']
    key: 'C'--> group: ['C']
    key: 'A'--> group: ['A']
    key: 'C'--> group: ['C']
    key: 'A'--> group: ['A', 'A']
    key: 'D'--> group: ['D']
    key: 'B'--> group: ['B', 'B', 'B']
    
    # Feature B: group all occurrences
    >>> print_groupby(sorted("BCAACACAADBBB"))
    key: 'A'--> group: ['A', 'A', 'A', 'A', 'A']
    key: 'B'--> group: ['B', 'B', 'B', 'B']
    key: 'C'--> group: ['C', 'C', 'C']
    key: 'D'--> group: ['D']
    
    # Feature C: group by a key function
    >>> # islower = lambda s: s.islower()                      # equivalent
    >>> def islower(s):
    ...     """Return True if a string is lowercase, else False."""   
    ...     return s.islower()
    >>> print_groupby(sorted("bCAaCacAADBbB"), keyfunc=islower)
    key: 'False'--> group: ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'D']
    key: 'True'--> group: ['a', 'a', 'b', 'b', 'c']
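Features B and C combine: when a key function is used, the input generally needs to be sorted by that same key for all matching items to land in one group. Here is a small sketch using parity as the key; is_even and the sample numbers are made up for illustration.

```python
from itertools import groupby


def is_even(n):
    """Return True for even integers, else False."""
    return n % 2 == 0


numbers = [1, 3, 2, 4, 5, 6]

# Unsorted input: a new group starts every time the key changes.
unsorted_groups = [(k, list(g)) for k, g in groupby(numbers, key=is_even)]
print(unsorted_groups)
# [(False, [1, 3]), (True, [2, 4]), (False, [5]), (True, [6])]

# Sorting by the same key first collects every item with that key.
# sorted() is stable, so the original order is kept within each group.
ordered = sorted(numbers, key=is_even)
sorted_groups = [(k, list(g)) for k, g in groupby(ordered, key=is_even)]
print(sorted_groups)
# [(False, [1, 3, 5]), (True, [2, 4, 6])]
```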
    

    Uses

    • Anagrams (see notebook)
    • Binning
    • Group odd and even numbers
    • Group a list by values
    • Remove duplicate elements
    • Find indices of repeated elements in an array
    • Split an array into n-sized chunks
    • Find corresponding elements between two lists
    • Compression algorithm (see notebook)/Run Length Encoding
    • Grouping letters by length, key function (see notebook)
    • Consecutive values over a threshold (see notebook)
    • Find ranges of numbers in a list or continuous items (see docs)
    • Find all related longest sequences
    • Take consecutive sequences that meet a condition (see related post)
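Two of the uses listed above can be sketched in a few lines each: run-length encoding and finding ranges of consecutive numbers. The function names are hypothetical, and consecutive_ranges assumes its input is sorted in ascending order with no duplicates.

```python
from itertools import groupby


def run_length_encode(text):
    """Return (character, run length) pairs for consecutive runs."""
    return [(char, sum(1 for _ in run)) for char, run in groupby(text)]


def consecutive_ranges(numbers):
    """Return (start, end) pairs for runs of consecutive integers.

    Assumes `numbers` is sorted ascending without duplicates. Within a
    run, value minus index is constant, so that difference is the key.
    """
    ranges = []
    indexed = enumerate(numbers)
    for _, run in groupby(indexed, key=lambda pair: pair[1] - pair[0]):
        values = [value for _, value in run]
        ranges.append((values[0], values[-1]))
    return ranges


print(run_length_encode("AAAABBBCCD"))
# [('A', 4), ('B', 3), ('C', 2), ('D', 1)]

print(consecutive_ranges([1, 2, 3, 7, 8, 10]))
# [(1, 3), (7, 8), (10, 10)]
```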

    Note: Several of the latter examples derive from Víctor Terrón's PyCon talk (in Spanish), "Kung Fu at Dawn with Itertools". See also the groupby source code, written in C.

    * A key function is applied to every item, and the values it returns, rather than the items themselves, are compared to determine the result. Other functions that accept a key function include sorted(), max() and min().
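As a brief illustration of the same key-function idea in the other built-ins, here is a sketch that passes len to each of them. The word list is invented for the example.

```python
words = ["pear", "fig", "banana"]

# Each call compares len(word) instead of the words themselves.
by_length = sorted(words, key=len)
longest = max(words, key=len)
shortest = min(words, key=len)

print(by_length)  # ['fig', 'pear', 'banana']
print(longest)    # banana
print(shortest)   # fig
```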


    Response

    # OP: Yes, you can use `groupby`, e.g. 
    [do_something(list(g)) for _, g in groupby(lxml_elements, criteria_func)]
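A runnable sketch of that one-liner follows. Since the original elements and helpers are not shown, everything here is a stand-in: the standard library's xml.etree.ElementTree substitutes for lxml, criteria_func groups by tag name, and do_something simply collects each element's text.

```python
from itertools import groupby
import xml.etree.ElementTree as ET

# Stand-in document; the real elements would come from lxml.
root = ET.fromstring("<root><a>1</a><a>2</a><b>3</b><a>4</a></root>")
elements = list(root)


def criteria_func(element):
    """Hypothetical grouping criterion: the element's tag name."""
    return element.tag


def do_something(group):
    """Hypothetical processing step: collect the text of each element."""
    return [element.text for element in group]


results = [do_something(list(g)) for _, g in groupby(elements, criteria_func)]
print(results)
# [['1', '2'], ['3'], ['4']]
```

Note that the final <a> element forms its own group because it is not adjacent to the first two; sorting the elements by criteria_func first would merge them.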
    
