Here is the code:
```python
from tensorflow.keras.preprocessing.text import Tokenizer

# Build a word-level tokenizer over the training text
# (vocab_size and train_df are defined earlier)
tokenizer = Tokenizer(num_words=vocab_size, char_level=False, split=' ')
tokenizer.fit_on_texts(train_df.Text)

# Map each text to a sequence of integer word indices.
# The original line was cut off after `tokenizer.texts_`; texts_to_sequences
# is the standard call at this point in the pipeline.
vec_text = tokenizer.texts_to_sequences(train_df.Text)
```
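The sequences produced above vary in length, so if they are going into a fixed-input model they usually need a padding step first. Below is a minimal sketch of that step; `max_len` is a hypothetical cap chosen here purely for illustration.

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_len = 100  # assumed maximum sequence length, not from the original code

# Pad (or truncate) every sequence to exactly max_len entries
padded = pad_sequences(vec_text, maxlen=max_len,
                       padding='post', truncating='post')
# padded has shape (num_texts, max_len), ready for e.g. an Embedding layer
```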