Pass tokens to CountVectorizer
Question: I have a text classification problem where I have two types of features: (1) features which are n-grams (extracted by CountVectorizer), and (2) other textual features (e.g. the presence of a word from a given lexicon). These features are different from n-grams since they should be a part of any n-gram extracted from the text. Both types of features are extracted from the text's tokens. I want to run tokenization only once, and then pass these tokens to CountVectorizer and to the other presence features
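
To make the goal concrete, here is a rough sketch of one way this might be done, assuming scikit-learn's CountVectorizer and its `tokenizer` and `preprocessor` callables. The `identity` helper, the sample texts, the `str.split` tokenizer and the `lexicon` set are placeholders made up for illustration, not part of any existing setup.

```python
from sklearn.feature_extraction.text import CountVectorizer


def identity(tokens):
    """Return the token list unchanged (hypothetical helper)."""
    return tokens


# Tokenize once; str.split is only a placeholder for the real tokenizer.
texts = ["good movie", "bad movie"]
tokenized_docs = [text.split() for text in texts]

# CountVectorizer still builds the n-grams, but from the ready-made tokens:
# the preprocessor and tokenizer steps just pass the list through.
vectorizer = CountVectorizer(
    tokenizer=identity,
    preprocessor=identity,
    token_pattern=None,  # not used when a custom tokenizer is supplied
    lowercase=False,
    ngram_range=(1, 2),
)
ngram_counts = vectorizer.fit_transform(tokenized_docs)

# Lexicon-presence feature computed from the same tokens (assumed lexicon).
lexicon = {"good", "great"}
lexicon_presence = [
    int(any(token in lexicon for token in doc)) for doc in tokenized_docs
]
```

Passing `analyzer=identity` instead would also accept token lists, but then CountVectorizer would count the tokens as given and skip its own n-gram construction, so the sketch keeps the default word analyzer.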