Average of elements in a list of lists grouped by first item in the list

Submitted by 偶尔善良 on 2021-01-27 14:22:17

Question


My list looks like my_list = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1]], ['B', 10, 7]]

I want to find the averages of the other two columns in each of the inner lists, grouped by the first column of each inner list. The expected result is:

[['A', 5, 7.5], ['B', 9.5, 5], ['C', 1, 1]]

['A', 5, 7.5] comes from ['A', (6+4)/2 ,(7+8)/2]

I don't mind if I end up getting a dictionary or something, but I would prefer it remain a list.

I've tried the following:


  1. Splitting the original list so that the first column becomes the key and the second and third columns become the value, then converting it to a dictionary:

    my_list1 = [i[0] for i in my_list]
    my_list2 = [i[1:] for i in my_list]
    new_dict = {k: v for k, v in zip(my_list1, my_list2)}

This gives me the aggregate, but the problem is that I want to preserve the decimal places: it rounds up and gives me whole numbers instead of float values.

my_list1 = ['A', 'A', 'B', 'C', 'B']

my_list2 = [[6, 7], [4, 8], [9, 3], [1, 1], [10, 7]]

new_dict= {'A': [5, 8], 'B': [10, 5], 'C': [1, 1]}

when what I would ideally want is [['A', 5, 7.5], ['B', 9.5, 5], ['C', 1, 1]] (I don't mind if it's a dictionary).


  2. Converting the second and third columns to float with a for loop, thinking that I would then get floats after converting to a dictionary. It made no difference: it still rounds up and gives a whole number.

    for i in range(0, len(my_list)):
      for j in range(1, len(my_list[i])):
        my_list[i][j].astype(float)
    
    dict = {}
    
    for l2 in my_list:
      dict[l2[0]] = l2[1:]
    

The reason I need to preserve the decimal places is that the second and third columns are x and y coordinates.

So, all in all, the objective is to find the averages of the other two columns in each of the inner lists, grouped by the first column of each inner list, with as many decimal places as possible.


Answer 1:


Assuming you meant to use the following list:

In [4]: my_list = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1], ['B', 10, 7]]

Then simply use a defaultdict to group by the first element, and find the mean of each group:

In [6]: from collections import defaultdict

In [7]: grouper = defaultdict(list)

In [8]: for k, *tail in my_list:
    ...:     grouper[k].append(tail)
    ...:

In [9]: grouper
Out[9]:
defaultdict(list,
            {'A': [[6, 7], [4, 8]], 'B': [[9, 3], [10, 7]], 'C': [[1, 1]]})

In [10]: import statistics

In [11]: {k: list(map(statistics.mean, zip(*v))) for k,v in grouper.items()}
Out[11]: {'A': [5, 7.5], 'B': [9.5, 5], 'C': [1, 1]}

Note: if you are on Python 2, there is no need to call list after map, and you should use iteritems instead of items.

Also, because Python 2 has no extended iterable unpacking (the `k, *tail` form used in the loop), you will have to do something like:

for sub in my_list:
    grouper[sub[0]].append(sub[1:])

instead of the cleaner Python 3 version.

Finally, there is no statistics module in Python 2. So just do:

def mean(seq):
    return float(sum(seq))/len(seq)

and use that mean instead of statistics.mean.
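Since the question prefers a list over a dictionary, the grouped means can be turned back into a list of lists. The following is only a minimal Python 3 sketch of that idea; the function name `group_means` is hypothetical and not part of the original answer, and the group order relies on dictionaries preserving insertion order (Python 3.7+).

```python
from collections import defaultdict
from statistics import mean


def group_means(rows):
    """Average the trailing columns of each row, grouped by the first item.

    `group_means` is a hypothetical helper name used for illustration.
    """
    grouper = defaultdict(list)
    for key, *tail in rows:
        grouper[key].append(tail)
    # zip(*tails) transposes the grouped rows into columns before averaging.
    return [[key, *map(mean, zip(*tails))] for key, tails in grouper.items()]


my_list = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1], ['B', 10, 7]]
result = group_means(my_list)
print(result)
```

Groups appear in the order their keys are first seen in the input, so the decimal part of each average (for example 7.5 for group 'A') is kept rather than rounded.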




Answer 2:


Similarly, using itertools.groupby:

import operator as op 
import itertools as it
import statistics as stats


iterables = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1], ['B', 10, 7]]
groups = it.groupby(sorted(iterables), op.itemgetter(0))
{k: list(map(stats.mean, zip(*[i[1:] for i in g]))) for k, g in groups}
# {'A': [5, 7.5], 'B': [9.5, 5], 'C': [1, 1]}
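The call to sorted in the snippet above matters: itertools.groupby only merges consecutive items that share a key, so on unsorted input the same key can show up in more than one group. A small sketch of the difference, using the same data (the variable names here are illustrative only):

```python
from itertools import groupby
from operator import itemgetter

rows = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1], ['B', 10, 7]]
first = itemgetter(0)

# Without sorting, the two 'B' rows are not adjacent, so 'B' appears twice.
unsorted_keys = [key for key, _ in groupby(rows, key=first)]

# Sorting by the first item makes equal keys adjacent, giving one group each.
sorted_keys = [key for key, _ in groupby(sorted(rows, key=first), key=first)]

print(unsorted_keys)
print(sorted_keys)
```

In a dict comprehension, the later 'B' group would silently overwrite the earlier one, which is why the input is sorted before grouping.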


Source: https://stackoverflow.com/questions/45850910/average-of-elements-in-a-list-of-list-grouped-by-first-item-in-the-list
