Get frequency of letters in a sentence

北城以北 submitted on 2019-12-11 08:12:57

Question


I am trying to write a function that takes an arbitrary sentence as input and counts how often each letter occurs in that string:

def getfreq(lines):
    """ calculate a list with letter frequencies

    lines - list of lines (character strings)

    both lower and upper case characters are counted.
    """
    totals = 26*[0]
    chars = []
    for line in lines:
       for ch in line:
           chars.append(totals)

    return totals

    # convert totals to frequency
    freqlst = []
    grandtotal = sum(totals)

    for total in totals:
        freq = totals.count(chars)
        freqlst.append(freq)
    return freqlst

So far I have managed to append each letter of the input to the list (chars). Now I need a way to count how many times each character occurs in that list, and to express this as a frequency.


Answer 1:


Without collections.Counter, using a collections.defaultdict instead:

import collections

sentence = "A long sentence may contain repeated letters"

count = collections.defaultdict(int)  # missing keys start at 0, so no existence check is needed
for letter in sentence:  # iterate over each character in the sentence
    count[letter] += 1  # increase the count for this character

Or if you really want to do it fully manually:

sentence = "A long sentence may contain repeated letters"

count = {}  # a counting dictionary
for letter in sentence:  # iterate over each character in the sentence
    count[letter] = count.get(letter, 0) + 1  # get the current value and increase by 1

In both cases the count dictionary has each distinct character as a key, and the value is the number of times that character was encountered, e.g.:

print(count["e"])  # 8

If you want the count to be case-insensitive, call letter.lower() on each letter before adding it to the count.
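A minimal sketch of the case-insensitive variant, built on the manual dictionary version above. The variable name key is introduced here only for illustration:

```python
sentence = "A long sentence may contain repeated letters"

count = {}  # a counting dictionary
for letter in sentence:
    key = letter.lower()  # fold "A" and "a" into the same key
    count[key] = count.get(key, 0) + 1

print(count["a"])  # counts both the upper-case "A" and the lower-case "a"
```

Spaces are still counted here. If only letters matter, one option is to skip any character for which letter.isalpha() is False.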




Answer 2:


There's a very handy class, Counter, in the collections module that counts how often each object occurs in a sequence:

import collections
collections.Counter('A long sentence may contain repeated letters')

which will produce:

Counter({' ': 6,
         'A': 1,
         'a': 3,
         'c': 2,
         'd': 1,
         'e': 8,
         'g': 1,
         'i': 1,
         'l': 2,
         'm': 1,
         'n': 5,
         'o': 2,
         'p': 1,
         'r': 2,
         's': 2,
         't': 5,
         'y': 1})

In your case, you might want to concatenate your lines first, e.g. with ''.join(lines), before passing the result to Counter.
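As a rough sketch of that suggestion, assuming lines is a list of strings as in the question (the sample list below is made up for illustration):

```python
import collections

# Hypothetical input standing in for the `lines` list from the question.
lines = ["A long sentence", "may contain", "repeated letters"]

# Join the lines into one string, then count every character in it.
counts = collections.Counter("".join(lines))

print(counts["e"])            # count for a single character
print(counts["z"])            # a Counter returns 0 for characters it never saw
print(counts.most_common(3))  # the three most frequent characters with their counts
```

Joining with an empty string means no extra characters are added between lines, so the counts reflect only what the lines themselves contain.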

If you want to achieve a similar result using raw dictionaries, you might want to do something like the following:

my_string = 'A long sentence may contain repeated letters'  # any input string

counts = {}
for c in my_string:
    counts[c] = counts.get(c, 0) + 1

Depending on your version of Python, this may be slower than Counter. It uses the .get() method of dict to return either the existing count or a default value of 0, and then increments the count for each character in your string.




Answer 3:


You can use a set to reduce the text to unique characters and then just count:

text = ' '.join(lines)  # Create one long string
# Then create a set of all unique characters in the text
characters = {char for char in text if char.isalpha()}
statistics = {}         # Create a dictionary to hold the results
for char in characters: # Loop through unique characters
    statistics[char] = text.count(char) # and count them
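The question also asks for frequencies rather than raw counts. The following is only a sketch of one way to get there with the same text.count approach. It keeps the getfreq name and the 26-entry list from the question, and it assumes that only the ASCII letters a-z matter, that upper and lower case are merged, and that "frequency" means a letter's count divided by the total number of letters:

```python
import string


def getfreq(lines):
    """Return 26 relative frequencies, one per letter a-z (case-insensitive)."""
    text = "".join(lines).lower()
    # Count each ASCII letter; spaces, digits and punctuation are ignored.
    totals = [text.count(ch) for ch in string.ascii_lowercase]
    grandtotal = sum(totals)
    if grandtotal == 0:
        return [0.0] * 26  # no letters at all, so avoid dividing by zero
    # Convert the raw counts to fractions of all counted letters.
    return [total / grandtotal for total in totals]


freqs = getfreq(["A long sentence may contain repeated letters"])
print(freqs[4])  # relative frequency of "e" (index 4 in a-z)
```

Calling text.count once per letter scans the text 26 times, which is fine for a sentence. For large inputs a single pass with a dictionary or Counter is likely the better choice.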


Source: https://stackoverflow.com/questions/50428095/get-frequency-of-letters-in-a-sentence
