Why is the number of stems produced by the NLTK stemmer different from the expected output?
Question

I have to perform stemming on a text. The tasks are as follows:

1. Tokenize all the words given in tc. Each word should contain only letters, digits, or underscores. Store the tokenized list of words in tw.
2. Convert all the words to lowercase. Store the result in the variable tw.
3. Remove all the stop words from the unique set of tw. Store the result in the variable fw.
4. Stem each word in fw with PorterStemmer, and store the result in the list psw.

Below is my code : import re import
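For reference, the four steps described in the question can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the asker's code: the sample text in `tc` and the tiny `stop_words` set are hypothetical stand-ins (the exercise presumably uses its own text and NLTK's English stop-word list), and `\w+` is assumed to be an acceptable reading of "alphabets or numbers or underscore".

```python
import re
from nltk.stem import PorterStemmer

# Hypothetical sample text standing in for the exercise's `tc`.
tc = "The cats are running and the Cats were jumping"

# Hypothetical stop-word set; the exercise presumably uses NLTK's list.
stop_words = {"the", "are", "and", "were"}

# Step 1: tokenize into runs of letters, digits, or underscores.
tw = re.findall(r"\w+", tc)

# Step 2: lowercase every token.
tw = [w.lower() for w in tw]

# Step 3: take the unique set of tw, then drop stop words.
# Note the order: "Cats" and "cats" only collapse into one entry
# because lowercasing happened before the set was built.
fw = [w for w in set(tw) if w not in stop_words]

# Step 4: stem each remaining word with PorterStemmer.
stemmer = PorterStemmer()
psw = [stemmer.stem(w) for w in fw]

# Sets are unordered, so sort before printing or comparing.
print(sorted(psw))
```

One thing worth checking when the count of stems differs from the expected count is the order of the steps: if the unique set is built before lowercasing, words that differ only in case stay as separate entries and later yield duplicate stems.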