CountVectorizer: “I” not showing up in vectorized text

长情又很酷 2021-02-04 10:02

I'm new to scikit-learn, and currently studying Naïve Bayes (Multinomial). Right now, I'm working on vectorizing text from sklearn.feature_extraction.text, and for some reason

2 Answers
  •  孤城傲影
    2021-02-04 10:51

    This is because CountVectorizer lowercases all text by default (lowercase=True), so capital letters are not preserved.

    Use

    vectorizer_train = CountVectorizer(min_df=0, lowercase=False)
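
Note that lowercase=False on its own may not be enough to bring "I" back. As far as the scikit-learn documentation describes it, the default token_pattern only keeps tokens of two or more alphanumeric characters, so a one-letter word like "I" is dropped whatever its case. The following is a minimal sketch of that behaviour, not the asker's actual setup: the sample sentences and variable names are made up for illustration, and the relaxed token_pattern is one possible choice.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sample documents (not from the original question).
docs = ["I like tea", "I am here"]

# Case is preserved, but the default token_pattern still requires
# at least two word characters per token, so "I" is skipped.
default_vec = CountVectorizer(lowercase=False)
default_vec.fit(docs)
default_vocab = sorted(default_vec.vocabulary_)

# Relaxed pattern (an assumption for this sketch): accept tokens of
# one or more word characters, so single letters are kept.
single_char_vec = CountVectorizer(lowercase=False, token_pattern=r"(?u)\b\w+\b")
single_char_vec.fit(docs)
single_char_vocab = sorted(single_char_vec.vocabulary_)

print(default_vocab)
print(single_char_vocab)
```

Keeping lowercase=False alongside the relaxed pattern stores the token as "I"; with the default lowercase=True it would be stored as "i" instead.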
    
