Question
My list looks like this:
my_list = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1], ['B', 10, 7]]
I want to find the averages of the other two columns, grouped by the first column of each inner list. The expected result is:
[['A', 5, 7.5], ['B', 9.5, 5], ['C', 1, 1]]
where ['A', 5, 7.5] comes from ['A', (6+4)/2, (7+8)/2].
I don't mind if I end up with a dictionary or something similar, but I would prefer that it remain a list.
I've tried the following:
my_list1 = [i[0] for i in my_list]
my_list2 = [i[1:] for i in my_list]
new_dict = {k: v for k, v in zip(my_list1, my_list2)}
Splitting the original list so that the first column becomes the key and the second and third columns become the value, then converting it to a dictionary, gives me the aggregate. The problem is that I want to preserve the decimal places, but it rounds up and gives me whole numbers instead of float values:
my_list1 = ['A', 'A', 'B', 'C', 'B']
my_list2 = [[6, 7], [4, 8], [9, 3], [1, 1], [10, 7]]
new_dict = {'A': [5, 8], 'B': [10, 5], 'C': [1, 1]}
when what I would ideally want is [['A', 5, 7.5], ['B', 9.5, 5], ['C', 1, 1]]
(I don't mind if it's a dictionary.)
I also tried converting the second and third columns to float with a for loop, thinking that it would then give me floats when I convert it to a dictionary. But it made no difference: it still rounds up and gives a whole number.
for i in range(0, len(my_list)):
    for j in range(1, len(my_list[i])):
        my_list[i][j].astype(float)
dict = {}
for l2 in my_list:
    dict[l2[0]] = l2[1:]
The reason I need to preserve the decimal places is that the second and third columns are x and y coordinates.
So, all in all, the objective is to find the averages of the other two columns, grouped by the first column of each inner list, with as many decimal places as possible.
Answer 1:
Assuming you meant to use the following list:
In [4]: my_list = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1], ['B', 10, 7]]
Then simply use a defaultdict to group by the first element, then find the mean:
In [6]: from collections import defaultdict
In [7]: grouper = defaultdict(list)
In [8]: for k, *tail in my_list:
   ...:     grouper[k].append(tail)
   ...:
In [9]: grouper
Out[9]:
defaultdict(list,
{'A': [[6, 7], [4, 8]], 'B': [[9, 3], [10, 7]], 'C': [[1, 1]]})
In [10]: import statistics
In [11]: {k: list(map(statistics.mean, zip(*v))) for k,v in grouper.items()}
Out[11]: {'A': [5, 7.5], 'B': [9.5, 5], 'C': [1, 1]}
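Since the question prefers a list of lists over a dictionary, the same grouping can be wrapped up to return that shape. This is only a minimal sketch: the function name group_means is made up for illustration, and it assumes Python 3.7+ so that the groups come out in first-seen order.

```python
from collections import defaultdict
import statistics


def group_means(rows):
    """Average the numeric columns of rows, grouped by each row's first item.

    group_means is a hypothetical helper name, not part of any library.
    """
    grouper = defaultdict(list)
    for key, *values in rows:
        grouper[key].append(values)
    # zip(*group) transposes a group's rows into columns, one per coordinate
    return [[key, *map(statistics.mean, zip(*group))]
            for key, group in grouper.items()]


my_list = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1], ['B', 10, 7]]
result = group_means(my_list)
print(result)  # [['A', 5, 7.5], ['B', 9.5, 5], ['C', 1, 1]]
```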
Note, if you are on Python 2, there is no need to call list after map. Also, you should use iteritems instead of items.
Also, you will have to do something like:
for sub in my_list:
    grouper[sub[0]].append(sub[1:])
instead of the cleaner version on Python 3.
Finally, there is no statistics module in Python 2. So just do:
def mean(seq):
    return float(sum(seq))/len(seq)
and use that mean instead of statistics.mean.
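As a side note on why the dict comprehension in the question cannot average anything: a dictionary holds one value per key, so each repeated key simply overwrites the earlier entry. The sketch below reruns that comprehension on the corrected list (the names keys and values are mine); as far as I can tell, no rounding is involved, the last row for each key just wins.

```python
my_list = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1], ['B', 10, 7]]

keys = [row[0] for row in my_list]
values = [row[1:] for row in my_list]

# Later duplicates of a key replace earlier ones; nothing is aggregated.
new_dict = {k: v for k, v in zip(keys, values)}
print(new_dict)  # {'A': [4, 8], 'B': [10, 7], 'C': [1, 1]}
```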
Answer 2:
Similarly, using itertools.groupby:
import operator as op
import itertools as it
import statistics as stats
iterables = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1], ['B', 10, 7]]
groups = it.groupby(sorted(iterables), op.itemgetter(0))
{k: list(map(stats.mean, zip(*[i[1:] for i in g]))) for k, g in groups}
# {'A': [5, 7.5], 'B': [9.5, 5], 'C': [1, 1]}
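One detail worth spelling out is why the input is sorted first: itertools.groupby only merges consecutive items with the same key, so unsorted input would yield two separate 'B' groups. A small sketch showing the difference (sorting by the first item only, via key=, is my own variation so that the remaining columns never need to be comparable):

```python
import itertools as it
import operator as op

rows = [['A', 6, 7], ['A', 4, 8], ['B', 9, 3], ['C', 1, 1], ['B', 10, 7]]
first = op.itemgetter(0)

# Without sorting, the two 'B' rows are not adjacent and form separate groups.
unsorted_keys = [k for k, _ in it.groupby(rows, first)]
# After sorting by the first item, each key appears exactly once.
sorted_keys = [k for k, _ in it.groupby(sorted(rows, key=first), first)]

print(unsorted_keys)  # ['A', 'B', 'C', 'B']
print(sorted_keys)    # ['A', 'B', 'C']
```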
Source: https://stackoverflow.com/questions/45850910/average-of-elements-in-a-list-of-list-grouped-by-first-item-in-the-list