nltk stemmer: string index out of range

轮回少年 2021-02-05 13:11

I have a set of pickled text documents which I would like to stem using nltk's PorterStemmer. For reasons specific to my project, I would like to do the stemming i

2 answers
  •  旧巷少年郎
    2021-02-05 13:34

    This is an NLTK bug specific to NLTK version 3.2.2, for which I am to blame. It was introduced by PR https://github.com/nltk/nltk/pull/1261 which rewrote the Porter stemmer.

    I wrote a fix which went out in NLTK 3.2.3. If you're on version 3.2.2 and want the fix, just upgrade - e.g. by running

    pip install -U nltk
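
If you are not sure which version you have, a small check like the one below may help before upgrading. This is only a sketch: `is_affected` and `check_installed_nltk` are hypothetical helper names made up for illustration, and the only facts they encode are the ones stated in this answer (3.2.2 is affected, 3.2.3 carries the fix). The sole NLTK attribute used is `nltk.__version__`.

```python
# Sketch only: helper names are illustrative, not part of NLTK.
AFFECTED_VERSION = "3.2.2"  # the release named in the answer; fixed in 3.2.3


def is_affected(version: str) -> bool:
    """Return True if `version` is the NLTK release the answer names as buggy."""
    return version.strip() == AFFECTED_VERSION


def check_installed_nltk() -> str:
    """Describe the status of whichever NLTK is installed, if any."""
    try:
        import nltk
    except ImportError:
        return "nltk is not installed"
    if is_affected(nltk.__version__):
        return "nltk 3.2.2 detected: upgrade with `pip install -U nltk`"
    return f"nltk {nltk.__version__} is not the affected release"


if __name__ == "__main__":
    print(check_installed_nltk())
```

If the message reports 3.2.2, the upgrade command above should be all that is needed; any other version means the error probably has a different cause.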
    
