Sorting and Grouping Nested Lists in Python

没有蜡笔的小新 2020-11-30 20:05

I have the following data structure (a list of lists):

[
 ['4', '21', '1', '14', '2008-10-24 15:42:58'], 
 ['3', '22', '4', '2somename', '2


        
8 Answers
  • 2020-11-30 20:58

    For the first question, start by sorting the list by the second field, using itemgetter from the operator module. groupby only merges consecutive items with equal keys, so the sort has to come first:

    x = [
     ['4', '21', '1', '14', '2008-10-24 15:42:58'], 
     ['3', '22', '4', '2somename', '2008-10-24 15:22:03'], 
     ['5', '21', '3', '19', '2008-10-24 15:45:45'], 
     ['6', '21', '1', '1somename', '2008-10-24 15:45:49'], 
     ['7', '22', '3', '2somename', '2008-10-24 15:45:51']
    ]
    
    from operator import itemgetter
    
    x.sort(key=itemgetter(1))
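
    One thing to watch for: the fields are strings, so the sort above compares them as text, not as numbers. That makes no difference for the sample data, but if the values in that column can differ in length, a numeric key is probably safer. A minimal sketch, assuming the column always holds an integer (sort_by_int and the three-row sample are hypothetical, made up only to show the difference):

```python
def sort_by_int(rows, column):
    """Return a new list sorted by the integer value of the given column."""
    return sorted(rows, key=lambda row: int(row[column]))


# Illustrative rows only, not taken from the question's data.
sample = [
    ['1', '100', 'a'],
    ['2', '21', 'b'],
    ['3', '9', 'c'],
]

by_text = sorted(sample, key=lambda row: row[1])  # plain string comparison
by_number = sort_by_int(sample, 1)                # numeric comparison

print([row[1] for row in by_text])    # ['100', '21', '9']
print([row[1] for row in by_number])  # ['9', '21', '100']
```

    The groupby step does not care which ordering you pick, as long as equal keys end up next to each other.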
    

    Then you can use itertools' groupby function:

    from itertools import groupby
    y = groupby(x, itemgetter(1))
    

    Now y is an iterator that yields (key, group) tuples, where key is the value of the second field and group is an iterator over the rows that share it. It is easier to show these tuples in code than to explain them:

    for elt, items in groupby(x, itemgetter(1)):
        print(elt, items)
        for i in items:
            print(i)
    

    Which prints:

    21 <itertools._grouper object at 0x511a0>
    ['4', '21', '1', '14', '2008-10-24 15:42:58']
    ['5', '21', '3', '19', '2008-10-24 15:45:45']
    ['6', '21', '1', '1somename', '2008-10-24 15:45:49']
    22 <itertools._grouper object at 0x51170>
    ['3', '22', '4', '2somename', '2008-10-24 15:22:03']
    ['7', '22', '3', '2somename', '2008-10-24 15:45:51']
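
    Note that the group iterators are lazy and share the underlying data with groupby, so each one is only usable until you advance to the next group. If you need the groups later, it may be simpler to copy them into a dict of lists. A minimal sketch, assuming the same data as above (group_rows is a hypothetical helper name, not something from itertools):

```python
from itertools import groupby
from operator import itemgetter

rows = [
    ['4', '21', '1', '14', '2008-10-24 15:42:58'],
    ['3', '22', '4', '2somename', '2008-10-24 15:22:03'],
    ['5', '21', '3', '19', '2008-10-24 15:45:45'],
    ['6', '21', '1', '1somename', '2008-10-24 15:45:49'],
    ['7', '22', '3', '2somename', '2008-10-24 15:45:51'],
]


def group_rows(rows, column):
    """Return {key: [row, ...]} for the given column index."""
    key = itemgetter(column)
    ordered = sorted(rows, key=key)  # groupby needs equal keys to be adjacent
    # list(group) copies each group before groupby moves on to the next key.
    return {k: list(group) for k, group in groupby(ordered, key=key)}


grouped = group_rows(rows, 1)
for k, members in grouped.items():
    print(k, len(members))
```

    Because sorted is stable, the rows inside each group keep their original relative order.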
    

    For the second part, use a list comprehension, as already mentioned in other answers here:

    from pprint import pprint as pp
    pp([y for y in x if y[3] == '2somename'])
    

    Which prints:

    [['3', '22', '4', '2somename', '2008-10-24 15:22:03'],
     ['7', '22', '3', '2somename', '2008-10-24 15:45:51']]
    
  • 2020-11-30 21:04

    If you'll be doing a lot of sorting and filtering, you may like some helper functions.

    m = [
     ['4', '21', '1', '14', '2008-10-24 15:42:58'], 
     ['3', '22', '4', '2somename', '2008-10-24 15:22:03'], 
     ['5', '21', '3', '19', '2008-10-24 15:45:45'], 
     ['6', '21', '1', '1somename', '2008-10-24 15:45:49'], 
     ['7', '22', '3', '2somename', '2008-10-24 15:45:51']
    ]
    
    # Sort and filter helpers.
    sort_on   = lambda pos:     lambda x: x[pos]
    filter_on = lambda pos,val: lambda l: l[pos] == val
    
    # Sort by second column
    m = sorted(m, key=sort_on(1))
    
    # Filter on 4th column, where value == '2somename'
    # (list() is needed on Python 3, where filter() returns a lazy iterator)
    m = list(filter(filter_on(3, '2somename'), m))
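
    The same helpers can also be written as named functions, which tends to give clearer tracebacks than lambdas bound to names. A sketch of that variant, assuming the same data (the names by_second and matching are mine, and using itemgetter for sort_on is a substitution, not what the code above does):

```python
from operator import itemgetter

m = [
    ['4', '21', '1', '14', '2008-10-24 15:42:58'],
    ['3', '22', '4', '2somename', '2008-10-24 15:22:03'],
    ['5', '21', '3', '19', '2008-10-24 15:45:45'],
    ['6', '21', '1', '1somename', '2008-10-24 15:45:49'],
    ['7', '22', '3', '2somename', '2008-10-24 15:45:51'],
]


def sort_on(pos):
    """Return a key function that picks column pos from a row."""
    return itemgetter(pos)


def filter_on(pos, val):
    """Return a predicate that is true when column pos equals val."""
    def matches(row):
        return row[pos] == val
    return matches


# Sort by the second column, then keep rows whose 4th column is '2somename'.
by_second = sorted(m, key=sort_on(1))
matching = list(filter(filter_on(3, '2somename'), by_second))

for row in matching:
    print(row)
```

    Keeping the sorted and filtered results in separate names, instead of reassigning m, leaves the original list available for other queries.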
    