I have to count the word frequency in a text using Python. I thought of keeping the words in a dictionary and storing a count for each of them.
Now if I have to so
You could use Counter and defaultdict in the Python 2.7 collections module in a two-step process. First, use Counter to create a dictionary where each word is a key with the associated frequency count. This is fairly trivial.
Second, defaultdict can be used to create an inverted or reversed dictionary where the keys are the frequencies of occurrence and the associated values are lists of the word or words that were encountered that many times. Here's what I mean:
from collections import Counter, defaultdict

wordlist = ['red', 'yellow', 'blue', 'red', 'green', 'blue', 'blue', 'yellow']

# invert a temporary Counter(wordlist) dictionary so keys are the
# frequency of occurrence and values are lists of the words encountered
freqword = defaultdict(list)
for word, freq in Counter(wordlist).items():
    freqword[freq].append(word)

# print in order of increasing frequency (with a sorted list of words)
for freq in sorted(freqword):
    print('count {}: {}'.format(freq, sorted(freqword[freq])))
Output:
count 1: ['green']
count 2: ['red', 'yellow']
count 3: ['blue']
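
If you are starting from raw text rather than a ready-made list of words, here is a minimal sketch of the first step. The word_frequencies helper, the sample_text string, and the regular expression used to split words are my own assumptions for illustration, so adjust the tokenizing to whatever counts as a "word" in your data:

```python
import re
from collections import Counter

def word_frequencies(text):
    """Return a Counter mapping each lowercased word to its count.

    Hypothetical helper: a "word" here is any run of ASCII letters
    or apostrophes, which is an assumption, not a general rule.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

# Made-up sample text, for illustration only
sample_text = "Red yellow blue. Red green blue, blue yellow!"
counts = word_frequencies(sample_text)

# most_common() yields (word, count) pairs, highest count first
for word, count in counts.most_common():
    print('{}: {}'.format(word, count))
```

The resulting Counter can be passed straight into the defaultdict inversion shown above in place of Counter(wordlist).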