I have a Dataframe that is a CountVectorizer, where each column is a different word from the corpus, each row represents a different document, and the values are binary to wheth