I would like to change the following phrases to vectors with sklearn:
Article 1. It is not good to eat pizza after midnight
Article 2. I wouldn't survive a
Look at the docs: CountVectorizer.fit_transform expects an iterable of strings (e.g. a list of strings). You are passing a single string instead.
This makes sense, because fit_transform in scikit-learn does two things: 1) it learns a model (fit) and 2) it applies the model to the data (transform). You want to build a matrix where the columns are all the words in the vocabulary and the rows correspond to the documents. For that, it needs to know the whole vocabulary of your corpus (all the columns), which is why it expects the full collection of documents rather than one string.
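To make the two steps concrete, here is a rough pure-Python sketch of the idea. It is not how scikit-learn is implemented: the functions fit and transform, the whitespace tokenization, and the two example documents are all assumptions made up for illustration.

```python
def fit(documents):
    """Learn the vocabulary: map each distinct word to a column index."""
    # Simplified tokenization (lowercase + whitespace split); an assumption,
    # not scikit-learn's actual tokenizer.
    words = sorted({w for doc in documents for w in doc.lower().split()})
    return {word: col for col, word in enumerate(words)}


def transform(documents, vocabulary):
    """Build one row of word counts per document, one column per word."""
    rows = []
    for doc in documents:
        row = [0] * len(vocabulary)
        for w in doc.lower().split():
            if w in vocabulary:
                row[vocabulary[w]] += 1
        rows.append(row)
    return rows


# Hypothetical corpus, only for illustration.
docs = ["pizza after midnight", "no pizza no problem"]

vocabulary = fit(docs)                # step 1: learn the columns
matrix = transform(docs, vocabulary)  # step 2: count words per document

print(vocabulary)
print(matrix)
```

The columns can only be fixed once every document has been seen, which is why the fit step takes the whole list of documents.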
This problem occurs when you pass the raw data, that is, a single string, directly to the extraction function. Instead, wrap the string in a list, Y = [X], and pass Y as the parameter; then it works. I faced this problem too.
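A minimal sketch of that fix, assuming a reasonably recent scikit-learn and the default CountVectorizer settings. The variable names X and Y follow the answer above; the other names are just for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

X = "It is not good to eat pizza after midnight"  # a single raw string

# Passing the bare string is rejected with a ValueError.
error_message = None
try:
    CountVectorizer().fit_transform(X)
except ValueError as err:
    error_message = str(err)

# Wrapping the string in a list makes it a corpus of one document.
Y = [X]
vectorizer = CountVectorizer()
matrix = vectorizer.fit_transform(Y)  # sparse matrix, one row per document

vocab = sorted(vectorizer.vocabulary_)  # learned words, in column order
counts = matrix.toarray()

print(error_message)
print(vocab)
print(counts)
```

With several articles, put them all in the same list so that they share one vocabulary and each article becomes one row.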