How can I create word count output in Python using only the reduce function?

Posted by 萝らか妹 on 2021-02-09 20:30:24

Question


I have the following list of tuples: [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]

I would like to know if I can use Python's reduce function to aggregate them and produce the following output: [('a', 3), ('b', 1), ('c', 2)]

If there are other ways, I would like to know about them as well (a loop is fine).


Answer 1:


It seems difficult to achieve using a plain reduce over the tuples, because if the two tuples being reduced don't bear the same letter, there is no single tuple that can hold the result. How would you reduce ('a', 1) and ('b', 1) to a viable result?

The best I could do was (with functools imported):

l = functools.reduce(lambda x, y: (x[0], x[1] + y[1]) if x[0] == y[0] else x + y, sorted(l))

It gave me ('a', 3, 'b', 1, 'c', 1, 'c', 1). So it kind of worked for the first letter, but handling the others would need more than one pass (recreating the tuples and running another similar reduce), which is not very efficient, to say the least.
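That said, the pairwise problem above goes away if reduce is given an initial accumulator that is not a tuple. The following is a minimal sketch, assuming a plain dict as the accumulator; the names add_pair, pairs and totals are made up for illustration and are not from the question:

```python
import functools

pairs = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

def add_pair(acc, pair):
    """Fold one (key, weight) pair into the running dict of totals."""
    key, weight = pair
    acc[key] = acc.get(key, 0) + weight
    return acc  # reduce passes this back in as acc for the next pair

# The third argument ({}) is the initial accumulator, so every step
# combines a dict with a tuple instead of two tuples.
totals = functools.reduce(add_pair, pairs, {})

# Turn the dict into the list of tuples asked for in the question.
result = sorted(totals.items())
print(result)  # [('a', 3), ('b', 1), ('c', 2)]
```

This mutates the accumulator in place, which is not very "functional", but it keeps the whole aggregation to a single pass.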

Anyway, here are two working ways of doing it:

First, using collections.Counter counting elements of the same kind:

l = [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]

import collections

c = collections.Counter()
for a,i in l:
    c[a] += i

We cannot simply pass a list comprehension of the letters to Counter, because each element carries a weight (even though it is always 1 here).

Result: a Counter, which is a dict subclass: Counter({'a': 3, 'c': 2, 'b': 1})
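To get from the Counter to the list of tuples requested in the question, something along these lines should do. It is only a sketch; the variable names as_list and by_count are illustrative:

```python
import collections

l = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

c = collections.Counter()
for a, i in l:
    c[a] += i  # add the weight of each tuple to its letter's total

# Sorted alphabetically by letter, as in the question's expected output.
as_list = sorted(c.items())
print(as_list)   # [('a', 3), ('b', 1), ('c', 2)]

# Or ordered from the most frequent letter to the least frequent.
by_count = c.most_common()
print(by_count)  # [('a', 3), ('c', 2), ('b', 1)]
```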

Second option: use itertools.groupby on the sorted list, grouping by name/letter, and performing the sum on the integers bearing the same letter:

import itertools
print([(k, sum(e for _, e in v)) for k, v in itertools.groupby(sorted(l), key=lambda x: x[0])])

result:

[('a', 3), ('b', 1), ('c', 2)]
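The sorted call matters here: itertools.groupby only merges consecutive items with the same key. A small sketch comparing both cases (the helper name summed_groups is made up for illustration):

```python
import itertools
import operator

l = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

def summed_groups(pairs):
    """Sum the weights of each run of consecutive pairs sharing a letter."""
    first = operator.itemgetter(0)
    return [(k, sum(e for _, e in v)) for k, v in itertools.groupby(pairs, key=first)]

# Without sorting, the later 'a' and 'c' form separate groups.
unsorted_groups = summed_groups(l)
print(unsorted_groups)  # [('a', 2), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

# After sorting, equal letters are adjacent and each appears once.
sorted_groups = summed_groups(sorted(l))
print(sorted_groups)    # [('a', 3), ('b', 1), ('c', 2)]
```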



Answer 2:


An alternative approach using the collections.defaultdict class and the sum function:

import collections

l = [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
d = collections.defaultdict(list)
for t in l:
    d[t[0]].append(t[1])

result = [(k,sum(v)) for k,v in d.items()]
print(result)

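The output (on Python versions before 3.7, where dicts do not preserve insertion order, the order of the tuples may differ):

[('a', 3), ('b', 1), ('c', 2)]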
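If only the totals are needed, the intermediate lists can probably be skipped. Here is a hedged variant, assuming defaultdict(int) so that missing letters start at 0; the names totals, key and weight are illustrative:

```python
import collections

l = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

# int() returns 0, so an unseen letter starts with a total of 0.
totals = collections.defaultdict(int)
for key, weight in l:
    totals[key] += weight

# Sort so the result does not depend on dict ordering.
result = sorted(totals.items())
print(result)  # [('a', 3), ('b', 1), ('c', 2)]
```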



Answer 3:


Another way is to create your own custom reduce function.
For example:
l = [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]

def myreduce(func , seq):
    output_dict = {}
    for k,v in seq:
        output_dict[k] = func(output_dict.get(k,0),v)
    return output_dict  

myreduce(lambda total, value: total + value, l)

Output:
{'a': 3, 'b': 1, 'c': 2}

Later on, you can convert the resulting dictionary into a list of tuples by calling list() on its items().
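As a rough sketch of that last step, assuming the combining function takes the running total first and the new value second (the names counts and as_tuples are illustrative):

```python
l = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

def myreduce(func, seq):
    """Reduce the values of (key, value) pairs per key, starting from 0."""
    output_dict = {}
    for k, v in seq:
        output_dict[k] = func(output_dict.get(k, 0), v)
    return output_dict

counts = myreduce(lambda total, value: total + value, l)

# Convert the dict into a list of (letter, count) tuples, sorted by letter.
as_tuples = sorted(counts.items())
print(as_tuples)  # [('a', 3), ('b', 1), ('c', 2)]
```

Because the starting value is hard-coded to 0, this helper suits sums and similar operations; other reductions may need a different starting value.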



Source: https://stackoverflow.com/questions/43172488/how-can-i-create-word-count-output-in-python-just-by-using-reduce-function
