Python Stemming words in a File

Submitted by て烟熏妆下的殇ゞ on 2019-12-07 22:24:19

Question


I want to stem the words in a file. Stemming works fine when I try it in the terminal, but it does not work when I apply it to a text file. Terminal code:

print PorterStemmer().stem_word('complications')

Function code:

def stemming_text_1():
    with open('test.txt', 'r') as f:
        text = f.read()
        print text
        singles = []    

        stemmer = PorterStemmer() #problem from HERE
        for plural in text:
            singles.append(stemmer.stem(plural))
        print singles

Input test.txt

126211 crashes bookmarks runs error logged debug core bookmarks
126262 manual change crashes bookmarks propagated ion view bookmarks

Desired/expected output

126211 crash bookmark runs error logged debug core bookmark
126262 manual change crash bookmark propagated ion view bookmark

Any suggestion will be greatly appreciated, thanks :)


Answer 1:


You need to split the text into words for the stemmer to work. Currently, the variable text contains the whole file as one big string, so the loop for plural in text: assigns each character of text to plural, and the stemmer is called on single characters rather than words.

Try for plural in text.split(): instead, which iterates over the whitespace-separated words.
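The difference between the two loops can be seen without any stemmer. The following is a minimal sketch that uses part of the first line of test.txt as sample data; the variable names chars and words are only for illustration.

```python
# Part of the first line of test.txt, used here as sample data.
text = "126211 crashes bookmarks runs"

# Iterating over a string yields one character at a time.
chars = [item for item in text]

# str.split() with no arguments splits on runs of whitespace,
# so this yields whole words.
words = text.split()

print(chars[:8])
print(words)
```

The first list holds single characters, which is what the stemmer was receiving in the original loop; the second holds the words it should receive.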

[EDIT] To get the output in the format you want, read the file line by line instead of all at once, and join the stemmed words of each line back together:

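from nltk.stem import PorterStemmer

def stemming_text_1():
    stemmer = PorterStemmer()  # create the stemmer once, outside the loop
    with open('test.txt', 'r') as f:
        for line in f:
            # split the line into words and stem each word
            singles = []
            for plural in line.split():
                singles.append(stemmer.stem(plural))
            # rebuild the line from the stemmed words
            print(' '.join(singles))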
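One way to make the per-line logic easier to check is to separate it from the file handling. The sketch below is an assumption about how that could be structured, not part of the original answer: stem_line and toy_stem are hypothetical names, and toy_stem is a crude stand-in that only strips a trailing "es" or "s" so that the example runs without NLTK. It is not the Porter algorithm and will give different results on many words.

```python
def stem_line(line, stem):
    """Stem every whitespace-separated word in one line.

    `stem` is any callable that maps a word to its stem.
    """
    return ' '.join(stem(word) for word in line.split())


def toy_stem(word):
    """Hypothetical stand-in stemmer: strips a trailing 'es' or 's'.

    This is NOT the Porter algorithm; it exists only for the demo.
    """
    for suffix in ('es', 's'):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word


# A shortened line from test.txt, including its trailing newline.
sample = "126211 crashes bookmarks runs error\n"
print(stem_line(sample, toy_stem))
```

With NLTK installed, the same helper should work by passing PorterStemmer().stem as the stem argument and calling stem_line on each line of the open file.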


Source: https://stackoverflow.com/questions/16835372/python-stemming-words-in-a-file
