Question
I have the following list of tuples: [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
I would like to know whether I can use Python's reduce function to aggregate them and produce the following output: [('a', 3), ('b', 1), ('c', 2)]
If there are other ways, I would like to know them as well (a loop is fine).
Answer 1:
This seems difficult to achieve with reduce, because if the two tuples being reduced don't bear the same letter, you cannot compute a result. How would you reduce ('a',1) and ('b',1) to some viable result?
The best I could do was:
import functools
l = functools.reduce(lambda x,y : (x[0],x[1]+y[1]) if x[0]==y[0] else x+y,sorted(l))
which gave me ('a', 3, 'b', 1, 'c', 1, 'c', 1). So it kind of worked for the first letter, but it would need more than one pass to handle the others (rebuilding the tuples and running another similar reduce), which is not very efficient, to say the least.
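The obstacle described above comes from combining two tuples directly. One possible workaround, sketched here on the assumption that a dict accumulator is acceptable, is to give functools.reduce an initial value and fold each tuple into it. The helper name add_pair is hypothetical and not part of the original answer.

```python
import functools

l = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

def add_pair(acc, pair):
    # Fold one (letter, weight) tuple into the running dict of totals.
    key, weight = pair
    acc[key] = acc.get(key, 0) + weight
    return acc

# The third argument is the initial accumulator, so reduce never has to
# combine two tuples that bear different letters.
totals = functools.reduce(add_pair, l, {})
result = sorted(totals.items())
print(result)  # [('a', 3), ('b', 1), ('c', 2)]
```

This stays a single pass over the list, at the cost of mutating the accumulator inside the reducing function.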
Anyway, here are two working ways of doing it.
First, using collections.Counter to count elements of the same kind:
import collections
l = [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
c = collections.Counter()
for a,i in l:
    c[a] += i
We cannot simply build the Counter from a list comprehension of the letters, because each element carries a weight (even though it is 1 here).
Result: a dictionary: Counter({'a': 3, 'c': 2, 'b': 1})
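Since the question asks for a list of tuples rather than a Counter, here is a minimal sketch of two ways to convert it. The variable names by_letter and by_count are illustrative only.

```python
import collections

l = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

c = collections.Counter()
for a, i in l:
    c[a] += i  # add the weight, not just 1

# Sorted alphabetically by letter, matching the output asked for.
by_letter = sorted(c.items())
# Sorted by descending total, using Counter's own helper.
by_count = c.most_common()

print(by_letter)  # [('a', 3), ('b', 1), ('c', 2)]
print(by_count)   # [('a', 3), ('c', 2), ('b', 1)]
```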
Second option: use itertools.groupby on the sorted list, grouping by name/letter and summing the integers that bear the same letter:
import itertools
print([(k,sum(e for _,e in v)) for k,v in itertools.groupby(sorted(l),key=lambda x : x[0])])
Result:
[('a', 3), ('b', 1), ('c', 2)]
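The sorted call matters here because itertools.groupby only groups consecutive items that share a key. The sketch below compares both cases. The helper name sum_groups is hypothetical and just wraps the expression above.

```python
import itertools

l = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

def sum_groups(pairs):
    # Sum the weights of each run of consecutive tuples with the same letter.
    return [(k, sum(e for _, e in v))
            for k, v in itertools.groupby(pairs, key=lambda x: x[0])]

unsorted_result = sum_groups(l)        # 'a' and 'c' each appear in two separate runs
sorted_result = sum_groups(sorted(l))  # equal letters are adjacent after sorting

print(unsorted_result)  # [('a', 2), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]
print(sorted_result)    # [('a', 3), ('b', 1), ('c', 2)]
```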
Answer 2:
An alternative approach using collections.defaultdict (a dict subclass) and the sum function:
import collections
l = [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
d = collections.defaultdict(list)
for t in l:
    d[t[0]].append(t[1])
result = [(k,sum(v)) for k,v in d.items()]
print(result)
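The output (on Python 3.7+, where dicts preserve insertion order; older versions may print the pairs in a different order):
[('a', 3), ('b', 1), ('c', 2)]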
Answer 3:
Another way is to write your own custom reduce function, for example:
l = [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
def myreduce(func, seq):
    # Keep one running result per key, starting from 0.
    output_dict = {}
    for k,v in seq:
        output_dict[k] = func(output_dict.get(k,0),v)
    return output_dict
print(myreduce(lambda total,value: total+value, l))
Output:
{'a': 3, 'b': 1, 'c': 2}
Later on, you can convert the resulting dict to a list of tuples, for example with list(output_dict.items()).
Source: https://stackoverflow.com/questions/43172488/how-can-i-create-word-count-output-in-python-just-by-using-reduce-function