Question
I want to extract n-grams from the tweets of two groups of users (labeled 0 and 1) and build a CSV file like the following to train a binary classifier:
user_tweets, ngram1, ngram2, ngram3, ..., label
1, 0.0, 0.0, 0.0, ..., 0
2, 0.0, 0.0, 0.0, ..., 1
..
My question is: should I first extract the important n-grams for each of the two groups and then score each of those n-grams in every user's tweets, or is there an easier way to do this?
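
For illustration, here is a minimal sketch of one possible simpler route. It assumes each user's tweets have already been joined into a single string. It builds one shared n-gram vocabulary over all users and scores each n-gram by its relative frequency per user, so there is no separate per-group extraction step. The names `word_ngrams`, `build_rows`, and `write_csv`, the toy tweets, and the choice of relative frequency as the score are all assumptions made for this example. Libraries such as scikit-learn's `CountVectorizer` or `TfidfVectorizer` (with `ngram_range`) do a similar job and would likely be preferable on real data.

```python
import csv
import io
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return the word n-grams of `text` for every n in [n_min, n_max]."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def build_rows(users, n_min=1, n_max=2):
    """Build a header and one feature row per user.

    `users` is a list of (user_id, tweets_as_one_string, label) tuples
    (a hypothetical input format chosen for this sketch).
    """
    counts = [Counter(word_ngrams(text, n_min, n_max)) for _, text, _ in users]
    # One vocabulary shared by both groups, sorted for a stable column order.
    vocab = sorted(set().union(*counts))
    header = ["user_tweets"] + vocab + ["label"]
    rows = []
    for (user_id, _, label), c in zip(users, counts):
        total = sum(c.values()) or 1  # avoid division by zero for empty text
        # Score = relative frequency of the n-gram in this user's tweets.
        rows.append([user_id] + [c[g] / total for g in vocab] + [label])
    return header, rows


def write_csv(header, rows, fileobj):
    """Write the header and rows to an open text file object."""
    writer = csv.writer(fileobj)
    writer.writerow(header)
    writer.writerows(rows)


# Toy data (made up for this example): (user id, tweets, label).
users = [
    (1, "good morning world", 0),
    (2, "good night world", 1),
]
header, rows = build_rows(users)

buffer = io.StringIO()
write_csv(header, rows, buffer)
print(buffer.getvalue())
```

With a real data set the vocabulary can become very large. Pruning rare n-grams or applying feature selection afterwards is a common follow-up, and that step is where the "important" n-grams would be picked.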
Source: https://stackoverflow.com/questions/66092089/binary-classification-using-the-n-grams