How to aggregate elements of a list of tuples if the tuples have the same first element?

日久生厌 2021-02-13 19:01

I have a list in which each value is a list of tuples. For example, this is the value I extract for one key:

     [('1998-01-20',8) , ('1998-01-22',4) ,


        
4 Answers
  • 2021-02-13 19:08

    I like to use defaultdict for counting:

    from collections import defaultdict
    
    lst = [('1998-01-20',8) , ('1998-01-22',4) , ('1998-06-18',8 ) , ('1999-07-15' , 7), ('1999-07-21',1)]
    
    result = defaultdict(int)
    
    for date, cnt in lst:
        year, month, day = date.split('-')
        result['-'.join([year, month])] += cnt
    
    print(result)
    
  • 2021-02-13 19:13

    Just use defaultdict:

    from collections import defaultdict
    
    
    DATA = [
        ('1998-01-20', 8),
        ('1998-01-22', 4),
        ('1998-06-18', 8),
        ('1999-07-15', 7),
        ('1999-07-21', 1),
    ]
    
    
    groups = defaultdict(int)
    for date, value in DATA:
        groups[date[:7]] += value
    
    
    from pprint import pprint
    pprint(groups)
    
  • 2021-02-13 19:19

    Try using itertools.groupby to aggregate values by month:

    from itertools import groupby
    a = [('1998-01-20', 8), ('1998-01-22', 4), ('1998-06-18', 8), 
         ('1999-07-15', 7), ('1999-07-21', 1)]
    
    for key, group in groupby(a, key=lambda x: x[0][:7]):
        print(key, sum(j for i, j in group))
    
    # Output
    
    1998-01 12
    1998-06 8
    1999-07 8
    

    Here's a one-liner version:

    print([(key, sum(j for i, j in group)) for key, group in groupby(a, key=lambda x: x[0][:7])])
    
    # Output
    
    [('1998-01', 12), ('1998-06', 8), ('1999-07', 8)]
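
    Note that groupby only merges consecutive items with equal keys, so the loop above relies on the list already being ordered by date. A minimal sketch for unordered input, assuming it is acceptable to sort first (the name aggregate_by_month is illustrative and not part of the answer):

```python
from itertools import groupby


def aggregate_by_month(pairs):
    """Sum the counts of (date, count) tuples per 'YYYY-MM' prefix."""
    def month(pair):
        return pair[0][:7]

    # groupby only groups adjacent items, so sort by the same key first.
    ordered = sorted(pairs, key=month)
    return [(key, sum(cnt for _, cnt in group))
            for key, group in groupby(ordered, key=month)]


# Same data as above, deliberately shuffled.
shuffled = [('1999-07-15', 7), ('1998-01-20', 8), ('1998-06-18', 8),
            ('1999-07-21', 1), ('1998-01-22', 4)]
print(aggregate_by_month(shuffled))
```

    Without the sorted() call, the shuffled list would yield '1998-01' and '1999-07' twice each, because their entries are not adjacent.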
    
  • 2021-02-13 19:26

    Yet another answer, different from the ones already given. You can simply create a new dictionary whose keys are the year-month combinations. Looping over the dates in your list and using dictionary.get(key, default) does the trick: get returns the running total for the key, or the default value 0 if the key does not exist yet, and the assignment then stores the updated total (creating the key if needed).

    data = [('1998-01-20',8) , ('1998-01-22',4) , ('1998-06-18',8 ) , ('1999-07-15' , 7), ('1999-07-21',1)]
    dictionary = dict()
    for (mydate, val) in data:
        ym = mydate[0:7]    # the key is only the year-month combination (e.g. '1998-01')
        dictionary[ym] = dictionary.get(ym, 0) + val  # get returns the current total or the default 0; the assignment creates the key
    
    data_aggregated = list(dictionary.items())  # if you need it back in the old format (Python 3: items() replaces iteritems())
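
    The question mentions that each list is the value stored under some key. As a rough sketch of how the same dict.get idea could be applied to every key at once, assuming the outer container is a plain dict (the names records and aggregate_monthly, and the keys 'a' and 'b', are made up for illustration):

```python
def aggregate_monthly(pairs):
    """Sum (date, value) tuples per 'YYYY-MM' prefix using dict.get."""
    totals = {}
    for date, val in pairs:
        ym = date[:7]
        totals[ym] = totals.get(ym, 0) + val
    # Sort so the result does not depend on insertion order.
    return sorted(totals.items())


# Hypothetical outer mapping; only the tuples come from the question.
records = {
    'a': [('1998-01-20', 8), ('1998-01-22', 4), ('1998-06-18', 8)],
    'b': [('1999-07-15', 7), ('1999-07-21', 1)],
}

monthly = {key: aggregate_monthly(pairs) for key, pairs in records.items()}
print(monthly)
```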
    