My word cloud contains repeated words, and I do not understand why they are not counted together and shown as a single word.
from wordcloud import Wor
That is a feature called 'collocations' in the word_cloud project. You can turn it off by setting collocations=False, like this:
wordcloud = WordCloud(collocations=False).generate(word_string)
This will get rid of two-word phrases (bigrams) that appear frequently in your text. It removes some you probably don't want, for instance "oh oh", but it also removes some you might want to keep, for instance "black culture".
If you look at wordcloud.words_ you will see the frequency table includes some two-word phrases like 'oh oh', 'hook start', 'lets go', 'lets hook'.
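To list only those two-word entries, you can filter the table for keys that contain a space. This is a rough sketch: the helper name two_word_phrases and the sample dictionary are made up for illustration, and in practice you would pass in wordcloud.words_ instead.

```python
def two_word_phrases(frequencies):
    """Return only the entries whose key contains a space (multi-word phrases)."""
    return {phrase: weight for phrase, weight in frequencies.items() if " " in phrase}


# Hypothetical stand-in for wordcloud.words_ (token -> relative frequency).
sample = {"oh oh": 1.0, "love": 0.8, "lets go": 0.4, "hook": 0.2}

phrases = two_word_phrases(sample)
print(phrases)
```

If the result is empty after you set collocations=False, the phrases are gone from the table.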
You would need to dig into the code behind .process_text() to see exactly why it does this.
As a work-around you could split word_string yourself to build a word-frequency table, then use .generate_from_frequencies() to create the image.
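A minimal sketch of that work-around is below. The tokenizing regex, the helper names count_words and render_cloud, and the sample text are my own assumptions, so adjust them to your data. Note that counting words yourself skips whatever cleanup .generate() normally does, so you may need to filter out stopwords on your own.

```python
import re
from collections import Counter


def count_words(text):
    """Lowercase the text and count every single word (letters and apostrophes)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def render_cloud(frequencies):
    """Build the word cloud from a word -> count mapping."""
    # Imported here so the counting step above runs even without wordcloud installed.
    from wordcloud import WordCloud
    return WordCloud().generate_from_frequencies(frequencies)


# Hypothetical sample text; use your own word_string here.
word_string = "oh oh lets go lets hook oh"
frequencies = count_words(word_string)
print(frequencies.most_common(3))
```

Calling render_cloud(frequencies) then gives you a WordCloud object that only ever sees single words, so no two-word phrases can appear.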