问题
There are many resources online that shows how to do a word count for single word
like this and this and this and others...
But I was not not able to find a concrete example for two words count frequency .
I have a csv file that has some strings in it.
FileList = "I love TV show makes me happy, I love also comedy show makes me feel like flying"
So I want the output to be like :
wordscount = {"I love": 2, "show makes": 2, "makes me" : 2 }
Of course I will have to strip all the comma, interrogation points.... {!, , ", ', ?, ., (,), [, ], ^, %, #, @, &, *, -, _, ;, /, \, |, }
I will also remove some stop words which I found here just to get more concrete data from the text.
How can I achieve this results using python?
Thanks!
回答1:
>>> from collections import Counter
>>> import re
>>>
>>> sentence = "I love TV show makes me happy, I love also comedy show makes me feel like flying"
>>> words = re.findall(r'\w+', sentence)
>>> two_words = [' '.join(ws) for ws in zip(words, words[1:])]
>>> wordscount = {w:f for w, f in Counter(two_words).most_common() if f > 1}
>>> wordscount
{'show makes': 2, 'makes me': 2, 'I love': 2}
来源:https://stackoverflow.com/questions/18952894/word-frequency-count-based-on-two-words-using-python