I want to tokenize an input file in Python. Please suggest how to do it; I am new to Python.
Tokenize an input file in Python
I have read something about regular expressions, but still some con
Try something like this:
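import nltk

# Read the whole file into a single string
with open("myfile.txt") as f:
    file_content = f.read()

# Split the text into word and punctuation tokens
# (word_tokenize relies on NLTK tokenizer data installed via nltk.download)
tokens = nltk.word_tokenize(file_content)
print(tokens)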
The NLTK tutorial is also full of easy-to-follow examples: http://nltk.googlecode.com/svn/trunk/doc/book/ch03.html
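Since you mentioned regular expressions: if you would rather not install NLTK, here is a rough sketch using only the standard `re` module. The helper names `tokenize_text` and `tokenize_file` are made up for this example, and the pattern is a simple assumption (runs of word characters, or single punctuation marks), so it will not handle contractions or abbreviations as carefully as NLTK does.

```python
import re

def tokenize_text(text):
    # \w+ matches a run of letters, digits or underscores;
    # [^\w\s] matches one character that is neither a word character nor whitespace
    return re.findall(r"\w+|[^\w\s]", text)

def tokenize_file(path):
    # Hypothetical helper: read the whole file, then tokenize its contents
    with open(path, encoding="utf-8") as f:
        return tokenize_text(f.read())

print(tokenize_text("Hello, world!"))  # ['Hello', ',', 'world', '!']
```

To use it on your file, call `tokenize_file("myfile.txt")`. For anything beyond simple text, the NLTK tokenizer is probably the better choice.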