Return the top n most frequently occurring chars and their respective counts in Python

我在风中等你 2021-01-24 03:05

How to return the top n most frequently occurring chars and their respective counts # e.g. 'aaaaaabbbbcccc', 2 should return [('a', 6), ('b'

2 answers
  •  抹茶落季
    2021-01-24 03:27

    Use collections.Counter(); it has a most_common() method that does just that:

    >>> from collections import Counter
    >>> counts = Counter('aaaaaabbbbcccc')
    >>> counts.most_common(2)
    [('a', 6), ('c', 4)]
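
As a rough sketch, the call above can be wrapped in a small function with the signature the question asks for. The name `top_n_chars` is hypothetical and not from the original post, and the sample input is chosen to have no tied counts, so the result does not depend on how ties are ordered.

```python
from collections import Counter

def top_n_chars(text, n):
    """Return up to n (char, count) pairs, most frequent first."""
    # Counter tallies each character; most_common(n) picks the n largest counts.
    return Counter(text).most_common(n)

print(top_n_chars('aaaaaabbbbccc', 2))   # [('a', 6), ('b', 4)]
print(top_n_chars('aaaaaabbbbccc', 10))  # n larger than the number of distinct chars
print(top_n_chars('', 3))                # []
```

Asking for more items than there are distinct characters simply returns every pair, and an empty string gives an empty list.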
    

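    Note that for both the above input and for aabc, b and c have the same count, and both are valid top contenders. Because you sort by count and then key in reverse, c is sorted before b. Counter.most_common() itself orders only by count: the output above is from an older Python version, where tied entries came back in arbitrary dictionary order, while since Python 3.7 elements with equal counts are returned in the order first encountered, so b would precede c there.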

    If instead of sorting in reverse, you used the negative count as the sort key, you'd sort b before c again:

    list4.sort(key=lambda v: (-v[1], v[0]))
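
For a self-contained version of the negative-count sort key, here is a minimal sketch. It assumes the list being sorted holds (char, count) pairs; since `list4` is not defined in the post, the pairs are built from a Counter here, and `top_n_sorted` is a hypothetical name.

```python
from collections import Counter

def top_n_sorted(text, n):
    """Return the n most frequent chars; ties are broken alphabetically."""
    counts = Counter(text)
    # The negative count puts larger counts first; the character itself
    # then breaks ties in ascending order.
    pairs = sorted(counts.items(), key=lambda v: (-v[1], v[0]))
    return pairs[:n]

print(top_n_sorted('aaaaaabbbbcccc', 2))  # [('a', 6), ('b', 4)]
print(top_n_sorted('aabc', 2))            # [('a', 2), ('b', 1)]
```

Because the tie-break is part of the key, the result is the same on every Python version, at the cost of sorting all distinct characters.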
    

    Note that Counter.most_common() does not actually sort when you ask for fewer items than there are keys in the counter; it uses a heapq-based algorithm instead to get only the top N items.
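
To illustrate the heapq-based approach, the sketch below selects the top N pairs with `heapq.nlargest` rather than a full sort. This is an approximation of what the standard library does, not a copy of its source, and the function names `top_n_heap` and `top_n_heap_ordered` are hypothetical.

```python
import heapq
from collections import Counter
from operator import itemgetter

def top_n_heap(text, n):
    """Select the n largest counts without sorting every item."""
    counts = Counter(text)
    # nlargest tracks only n candidates while scanning the items once.
    return heapq.nlargest(n, counts.items(), key=itemgetter(1))

def top_n_heap_ordered(text, n):
    """Same idea, with ties broken alphabetically by character."""
    counts = Counter(text)
    # Smallest (-count, char) means highest count, then earliest character.
    return heapq.nsmallest(n, counts.items(), key=lambda v: (-v[1], v[0]))

print(top_n_heap('aaaaaabbbbccc', 2))           # [('a', 6), ('b', 4)]
print(top_n_heap_ordered('aaaaaabbbbcccc', 2))  # [('a', 6), ('b', 4)]
```

The second function combines the heap selection with the negative-count key, so tied characters come back in a predictable order.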
