Finding the most frequent character in a string

执笔经年 2020-12-03 21:54

I found this programming problem while looking at a job posting on SO. I thought it was pretty interesting and as a beginner Python programmer I attempted to tackle it. Howe

10 answers
  • 2020-12-03 22:17

    If you want all the characters that share the maximum count, you can use a variation on one of the two ideas proposed so far:

    import heapq  # Helps finding the n largest counts
    import collections
    
    def find_max_counts(sequence):
        """
        Returns an iterator that produces the (element, count)s with the
        highest number of occurrences in the given sequence.
    
        In addition, the elements are sorted.
        """
    
        if len(sequence) == 0:
            return  # End the generator; raising StopIteration here is an error since PEP 479
    
        counter = collections.defaultdict(int)
        for elmt in sequence:
            counter[elmt] += 1
    
        counts_heap = [
            (-count, elmt)  # Negated counts: the largest count is the smallest heap item
            for (elmt, count) in counter.items()]
    
        heapq.heapify(counts_heap)
    
        highest_count = counts_heap[0][0]
    
        while True:
    
            try:
                (opp_count, elmt) = heapq.heappop(counts_heap)
            except IndexError:
                return  # Heap exhausted
    
            if opp_count != highest_count:
                return  # All remaining elements have a lower count
    
            yield (elmt, -opp_count)
    
    for (letter, count) in find_max_counts('balloon'):
        print((letter, count))  # Print the pair as a tuple
    
    for (word, count) in find_max_counts(['he', 'lkj', 'he', 'll', 'll']):
        print((word, count))
    

    This yields, for instance:

    lebigot@weinberg /tmp % python count.py
    ('l', 2)
    ('o', 2)
    ('he', 2)
    ('ll', 2)
    

    This works with any sequence: words, but also ['hello', 'hello', 'bonjour'], for instance.

    The heapq structure is very efficient at finding the smallest elements of a sequence without sorting it completely. On the other hand, since there are not that many letters in the alphabet, you can probably also run through the sorted list of counts until the maximum count is no longer found, without incurring any serious speed loss.
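The sorted-list alternative described above could look roughly like the following. This is only a sketch: `find_max_counts_sorted` is a hypothetical name that does not appear in the answer, it uses `collections.Counter` for the tally, and it returns a list rather than an iterator.

```python
import collections

def find_max_counts_sorted(sequence):
    """Return the (element, count) pairs that share the highest count."""
    counts = collections.Counter(sequence)
    if not counts:
        return []

    # Highest count first; ties are ordered by element.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    highest = ranked[0][1]

    result = []
    for elmt, count in ranked:
        if count != highest:
            break  # The maximum count is no longer found, so stop
        result.append((elmt, count))
    return result

print(find_max_counts_sorted('balloon'))  # [('l', 2), ('o', 2)]
```

For short inputs such as single words, the full sort costs little; the heap version mainly pays off when there are many distinct elements.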

  • 2020-12-03 22:20
    from collections import Counter
    
    # file: path of the file to read
    # quant: number of most frequent characters to return
    def frequent_letters(file, quant):
        with open(file) as f:  # the file is closed when the block ends
            text = f.read()
        return Counter(text).most_common(quant)
    
  • 2020-12-03 22:25

    Here are a few things I'd do:

    • Use collections.defaultdict instead of the dict you initialise manually.
    • Use built-in functions such as max and sorted instead of working it out yourself; it's easier.

    Here's my final result:

    from collections import defaultdict
    
    def find_max_letter_count(word):
        matches = defaultdict(int)  # makes the default value 0
    
        for char in word:
            matches[char] += 1
    
        return max(matches.items(), key=lambda x: x[1])  # pair with the highest count
    
    print(find_max_letter_count('helloworld'))  # ('l', 3)
    
  • 2020-12-03 22:27

    There are many shorter ways to do this. For example, you can use the Counter class (in Python 2.7 or later):

    import collections
    s = "helloworld"
    print(collections.Counter(s).most_common(1)[0])
    

    If you don't have that, you can do the tally manually (2.5 or later has defaultdict):

    d = collections.defaultdict(int)
    for c in s:
        d[c] += 1
    print(sorted(d.items(), key=lambda x: x[1], reverse=True)[0])
    

    Having said that, there's nothing too terribly wrong with your implementation.

  • 2020-12-03 22:29

    Here's a way using a for loop and str.count():

    w = input()
    r = 0   # highest count seen so far
    s = ""  # character with that count (stays empty for empty input)
    for i in w:
        p = w.count(i)
        if p > r:
            r = p
            s = i
    print(s)
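For comparison, the same count()-based idea can be condensed with the built-in max(). This is a minimal sketch, assuming Python 3.7 or later (where dicts keep insertion order); `most_frequent_char` is a hypothetical name, not part of the answer above.

```python
def most_frequent_char(text):
    """Return the most frequent character; ties go to the one seen first."""
    if not text:
        raise ValueError("text must not be empty")
    # dict.fromkeys drops duplicates while keeping first-seen order,
    # so text.count runs once per distinct character.
    return max(dict.fromkeys(text), key=text.count)

print(most_frequent_char("helloworld"))  # l
```

Like the loop above, this rescans the string for every distinct character, so a single-pass tally is preferable for long inputs.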
    
  • 2020-12-03 22:30

    Here is a way to find the most common character using a dictionary:

    message = "hello world"
    d = {}
    letters = set(message)
    for l in letters:
        d[message.count(l)] = l
    
    highest = max(d)  # largest count; dict keys are not sorted by count
    print(d[highest], highest)
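One caveat with keying the dictionary by count: letters that occur the same number of times overwrite one another, so only one letter per count survives. A sketch that keeps all of them, where `letters_by_count` is an illustrative name not taken from the answer:

```python
from collections import defaultdict

def letters_by_count(message):
    """Map each count to the sorted list of letters that occur that often."""
    groups = defaultdict(list)
    for letter in sorted(set(message)):
        groups[message.count(letter)].append(letter)
    return groups

groups = letters_by_count("hello world")
highest = max(groups)            # the largest count
print(groups[highest], highest)  # ['l'] 3
```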
    