How to generate word clouds from LDA models in Python?

再見小時候 2021-01-01 09:20

I am doing some topic modeling on newspaper articles and have implemented LDA using gensim in Python 3. Now I want to create a word cloud for each topic, using the top 20 wo

3 Answers
  • 2021-01-01 09:51

    Is there any way to just save the top words for each topic?

    Yes there is. jLDADMM outputs the top topical words for each topic. In version 1.0, only top topical words are written in the top-word output file, without their probabilities given the topic.

  • 2021-01-01 09:57

    You may also consider the pyLDAvis package, which can visualize LDA models trained with gensim. An example is shown here

  • 2021-01-01 10:13

    You can get the top n words for each topic of an LDA model using gensim's built-in method show_topic.

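    from gensim import models

    lda = models.LdaModel.load('lda.model')

    # Open the file once, outside the loop; opening it with 'w' inside the
    # loop would overwrite the file for every topic and keep only the last one.
    with open('output_file.txt', 'w', encoding='utf-8') as outfile:
        for i in range(lda.num_topics):
            outfile.write('Topic #{}: \n'.format(i + 1))
            # show_topic returns (word, probability) pairs
            for word, prob in lda.show_topic(i, topn=20):
                # Write the str directly; in Python 3, encode() would write b'...'
                outfile.write('{}\n'.format(word))
            outfile.write('\n')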
    

    This will write a file with a format similar to this:

    Topic #69: 
    pet
    dental
    tooth
    adopt
    animal
    puppy
    rescue
    dentist
    adoption
    animal
    shelter
    pet
    dentistry
    vet
    paw
    pup
    patient
    mix
    foster
    owner
    
    Topic #70: 
    periscope
    disneyland
    disney
    snapchat
    brandon
    britney
    periscope
    periscope
    replay
    britneyspear
    buffaloexchange
    britneyspear
    https
    meerkat
    blab
    periscope
    kxci
    toni
    disneyland
    location
    

    You may need to adjust this to your needs, e.g. return a list of the top 20 words for each topic instead of writing them to a text file.
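
If a list is more convenient than a text file, something like the following might work. This is only a minimal sketch: `top_words_per_topic` is a made-up helper name, and it assumes `lda` is a loaded gensim `LdaModel` as in the snippet above.

```python
def top_words_per_topic(lda, topn=20):
    """Return one list of top words per topic (hypothetical helper).

    Assumes `lda` exposes `num_topics` and `show_topic`, as gensim's
    LdaModel does.
    """
    topics = []
    for i in range(lda.num_topics):
        # show_topic yields (word, probability) pairs; keep only the words
        topics.append([word for word, _ in lda.show_topic(i, topn=topn)])
    return topics
```

Calling `top_words_per_topic(lda)` then gives a list of lists, indexed by topic number.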

    The answer in this post gives a good explanation of how to use raw text to create the word clouds: "How do I print lda topic model and the word cloud of each of the topics".
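
As an alternative to going through raw text, the word/probability pairs from show_topic can probably be fed to the third-party wordcloud package directly. The sketch below is not taken from the linked post: the helper names `topic_frequencies` and `save_topic_clouds` and the `topic_N.png` file names are made up for illustration, and it assumes the wordcloud package is installed.

```python
def topic_frequencies(topic_terms):
    """Turn (word, probability) pairs into a {word: weight} dict."""
    freqs = {}
    for word, prob in topic_terms:
        # Sum the weights in case a word appears more than once
        freqs[word] = freqs.get(word, 0.0) + float(prob)
    return freqs


def save_topic_clouds(lda, topn=20, prefix='topic_'):
    """Write one PNG word cloud per topic (hypothetical helper)."""
    # Imported here so topic_frequencies works without wordcloud installed
    from wordcloud import WordCloud

    for i in range(lda.num_topics):
        freqs = topic_frequencies(lda.show_topic(i, topn=topn))
        # Word size is scaled by the weight of each word within the topic
        cloud = WordCloud(background_color='white').generate_from_frequencies(freqs)
        cloud.to_file('{}{}.png'.format(prefix, i + 1))
```

With a model loaded as above, `save_topic_clouds(lda)` should produce topic_1.png, topic_2.png and so on in the working directory.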
