How to tokenize natural English text in an input file in Python?

不知归路 2021-01-03 05:26

I want to tokenize an input file in Python. Please suggest how to do it; I am a new user of Python.

I read something about regular expressions, but still some con

3 answers
  •  时光说笑
    2021-01-03 06:15

    Try something like this:

    import nltk

    # word_tokenize relies on NLTK's tokenizer data, installed once via nltk.download().
    with open("myfile.txt") as f:
        file_content = f.read()
    tokens = nltk.word_tokenize(file_content)
    print(tokens)
    

    The NLTK tutorial is also full of easy-to-follow examples: http://nltk.googlecode.com/svn/trunk/doc/book/ch03.html
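
    Since the question mentions regular expressions, here is a minimal sketch of a regex-based alternative that uses only the standard library re module. The pattern and the names TOKEN_PATTERN, simple_tokenize and tokenize_file are illustrative assumptions, not part of NLTK, and the splitting rules are much cruder than those of nltk.word_tokenize.

```python
import re

# Assumed pattern (illustrative only): a run of word characters, optionally
# followed by an apostrophe part such as "'s", or a single punctuation mark.
TOKEN_PATTERN = re.compile(r"\w+(?:'\w+)?|[^\w\s]")


def simple_tokenize(text):
    """Return the list of word and punctuation tokens found in text."""
    return TOKEN_PATTERN.findall(text)


def tokenize_file(path, encoding="utf-8"):
    """Read a whole text file and tokenize it (the encoding is an assumption)."""
    with open(path, encoding=encoding) as f:
        return simple_tokenize(f.read())


tokens = simple_tokenize("Hello, world! It's a test.")
print(tokens)
```

    For a file, tokenize_file("myfile.txt") would return the same kind of list. This approach could be enough for quick experiments; for contractions, abbreviations and other edge cases a dedicated tokenizer such as the NLTK one above is likely the better choice.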
