CountVectorizer: Vocabulary wasn't fitted

南方客 2021-02-19 08:41

I instantiated a sklearn.feature_extraction.text.CountVectorizer object by passing a vocabulary through the vocabulary argument, but I get a sklearn.utils.val

1 Answer
  • 2021-02-19 09:33

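    Even though you passed vocabulary=vocabulary_to_load to sklearn.feature_extraction.text.CountVectorizer(), the constructor only stores that argument. The fitted attribute vocabulary_, which get_feature_names() checks, is not set until the vocabulary has been validated. You therefore still need to call loaded_vectorizer._validate_vocabulary() (a private method) before you can call loaded_vectorizer.get_feature_names().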

    In your example, you should therefore do the following when creating a CountVectorizer object with your vocabulary:

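    import pickle
    import sklearn.feature_extraction.text

    # Pickle files must be opened in binary mode ('rb') under Python 3.
    with open(dictionary_filepath, 'rb') as f:
        vocabulary_to_load = pickle.load(f)
    loaded_vectorizer = sklearn.feature_extraction.text.CountVectorizer(
        ngram_range=(ngram_size, ngram_size), min_df=1, vocabulary=vocabulary_to_load)
    # Populate vocabulary_ from the vocabulary passed to the constructor.
    loaded_vectorizer._validate_vocabulary()
    print('loaded_vectorizer.get_feature_names(): {0}'.format(
        loaded_vectorizer.get_feature_names()))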
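If you would rather not call a private method, one possible alternative is to let transform() do the validation. The sketch below assumes a small hypothetical vocabulary (the dict and the sample sentence are made up for illustration and stand in for the unpickled vocabulary_to_load); it relies on transform() accepting a vectorizer that was given a fixed vocabulary but never fitted.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical vocabulary standing in for the unpickled vocabulary_to_load.
vocabulary = {"apple": 0, "banana": 1, "cherry": 2}

vectorizer = CountVectorizer(vocabulary=vocabulary)

# transform() validates the supplied vocabulary and sets vocabulary_,
# so neither fit() nor the private _validate_vocabulary() is called here.
counts = vectorizer.transform(["banana apple banana"])

# Recover the feature names in column order from the public vocabulary_ mapping.
feature_names = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)

print(feature_names)
print(counts.toarray())
```

Reading the names from vocabulary_ sidesteps the choice of accessor: newer scikit-learn releases replaced get_feature_names() with get_feature_names_out(), so which of the two exists depends on the installed version.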
    