Return the top n most frequently occurring chars and their respective counts in Python

我在风中等你 2021-01-24 03:05

How to return the top n most frequently occurring chars and their respective counts # e.g. 'aaaaaabbbbcccc', 2 should return [('a', 6), ('b'

2 Answers
  • 2021-01-24 03:27

    Use collections.Counter(); it has a most_common() method that does just that:

    >>> from collections import Counter
    >>> counts = Counter('aaaaaabbbbcccc')
    >>> counts.most_common(2)
    [('a', 6), ('c', 4)]
    

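    Note that in both the input above and in aabc, b and c have the same count, so either is a valid top contender. Your code sorts by count and then key in reverse, which puts c before b; the Counter output above puts c first too, most likely because older Python versions iterate a dict in arbitrary order. On Python 3.7 and later, most_common() returns elements with equal counts in the order first encountered, so the same call gives [('a', 6), ('b', 4)].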

    If instead of sorting in reverse, you used the negative count as the sort key, you'd sort b before c again:

    list4.sort(key=lambda v: (-v[1], v[0]))
    

    Note that Counter.most_common() does not actually sort when you ask for fewer items than there are keys in the counter; it uses a heapq-based algorithm instead to get only the top N items.
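
As a minimal sketch of the two points in this answer (a deterministic tie-break and a heap-based selection), the helper below combines them. The name top_n is hypothetical and does not come from the question. It assumes ties should go to the alphabetically smaller character, and it uses heapq.nsmallest with the key (-count, char) so the highest counts come out first.

```python
from collections import Counter
import heapq


def top_n(text, n):
    """Return the n most frequent characters as (char, count) pairs.

    Hypothetical helper: ties are broken alphabetically.
    """
    counts = Counter(text)
    # The smallest (-count, char) key is the largest count, then the smallest char.
    return heapq.nsmallest(n, counts.items(), key=lambda item: (-item[1], item[0]))


print(top_n('aaaaaabbbbcccc', 2))  # [('a', 6), ('b', 4)]
```

Unlike a plain most_common(2), the result here does not depend on dict iteration order, so it should be the same on any Python version.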

  • 2021-01-24 03:34

    A little harder, but also works:

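    text = "abbbaaaa"

    # Count each character by hand. The name `counts` avoids shadowing the built-in dict.
    counts = {}
    for char in text:
        # Iterating over a string already yields single characters.
        counts[char] = counts.get(char, 0) + 1

    print(counts)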
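
The loop above only builds the counts; it does not yet pick the top n. One way to finish the job, sketched here under the assumption that ties should be broken alphabetically, is to sort the (char, count) pairs. The function name top_n_manual is hypothetical.

```python
def top_n_manual(text, n):
    """Count characters with a plain dict, then return the n most frequent."""
    counts = {}
    for char in text:
        counts[char] = counts.get(char, 0) + 1
    # Sort by descending count, then by character for a stable tie-break.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


print(top_n_manual("abbbaaaa", 1))  # [('a', 5)]
```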
    