How to set custom stop words for sklearn CountVectorizer?

盖世英雄少女心 2021-01-04 11:02

I'm trying to run LDA (Latent Dirichlet Allocation) on a non-English text dataset.

From sklearn's tutorial, there's this part where you count term frequency of th

1 Answer
  • 2021-01-04 12:03

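    You can pass your own collection of words to the stop_words argument of CountVectorizer. The documented type is a list; a frozenset was accepted when this was written, but recent scikit-learn releases validate parameter types, so a plain list is the safer choice, e.g.:

    stop_words = ["word1", "word2", "word3"]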
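
For context, here is a minimal sketch of how such a collection might be passed to CountVectorizer. The stop words and sample documents are made up for illustration, and a plain list is used because that is the type the scikit-learn documentation names for stop_words:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stop words and documents, for illustration only.
custom_stop_words = ["le", "la", "de"]
docs = ["le chat de la maison", "la maison de le chien"]

# Tokens that match an entry in stop_words are dropped before counting.
vectorizer = CountVectorizer(stop_words=custom_stop_words)
term_counts = vectorizer.fit_transform(docs)

# The learned vocabulary should contain no stop words.
vocabulary = sorted(vectorizer.vocabulary_)
print(vocabulary)
print(term_counts.toarray())
```

The resulting term-count matrix can then be given to LatentDirichletAllocation in the same way as in the tutorial. Note that stop words are compared against tokens after lowercasing, so the entries should be lowercase unless lowercase=False is set.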
    