I am a beginner in Python and NLTK. I am trying to run the following code from a tutorial:
from nltk.corpus import gutenberg
from nltk import FreqDist
fd =
Some features have been deprecated: in NLTK 3, FreqDist no longer has the inc() method.
The code in question does work on NLTK 2.0.4:
https://pypi.python.org/pypi/nltk/2.0.4
To install version 2.0.4, run:
wget https://pypi.python.org/packages/source/n/nltk/nltk-2.0.4.zip#md5=cbd04d8635f1358a69a38c4774be029c
7z x nltk-2.0.4.zip
cd nltk-2.0.4/
python setup.py install
To check which version is installed, run:
pip show nltk
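The installed version can also be read from inside Python. This is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper name installed_version is made up for illustration and is not part of NLTK.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string of the installed NLTK, or None if NLTK is absent.
print(installed_version("nltk"))
```

If the result starts with "2.", the tutorial's inc() call should still work; with 3.x it will not.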
For anyone looking to adapt the book example to NLTK 3.0:
import nltk
from nltk.corpus import brown

suffix_fdist = nltk.FreqDist()
for word in brown.words():
    word = word.lower()
    # FreqDist.inc() is gone in NLTK 3, so increment the counts directly
    suffix_fdist[word[-1:]] += 1
    suffix_fdist[word[-2:]] += 1
    suffix_fdist[word[-3:]] += 1

# most_common() returns (suffix, count) pairs; keep only the suffixes
common_suffixes = []
for suffix, count in suffix_fdist.most_common(100):
    common_suffixes.append(str(suffix))
print(common_suffixes)
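To try the suffix counting without downloading the Brown corpus, the same idea can be run on a small word list. This is only a sketch: the function name common_suffixes, the sample words, and the fallback to collections.Counter (the class FreqDist subclasses in NLTK 3) are assumptions added so the snippet runs even without NLTK installed.

```python
try:
    from nltk import FreqDist
except ImportError:
    # Assumption: Counter stands in for FreqDist when NLTK is not installed.
    from collections import Counter as FreqDist


def common_suffixes(words, n=100):
    """Return the n most frequent 1-, 2- and 3-letter word endings."""
    suffix_fdist = FreqDist()
    for word in words:
        word = word.lower()
        for length in (1, 2, 3):
            suffix_fdist[word[-length:]] += 1
    # most_common() yields (suffix, count) pairs; keep only the suffixes.
    return [suffix for suffix, _count in suffix_fdist.most_common(n)]


# Hypothetical sample data, standing in for brown.words().
sample = ["Walking", "talking", "walked", "talked", "sing"]
print(common_suffixes(sample, n=3))
```

Passing brown.words() instead of the sample list should give the same kind of result as the loop above.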
The latest version of NLTK doesn't have inc(). I used update() instead:
from nltk.corpus import gutenberg
from nltk import FreqDist
fd = FreqDist()
for word in gutenberg.words('austen-sense.txt'):
    fd.update([word])
update() takes an iterable, so make sure you pass one. Wrap a single word in a list; a bare string would be counted character by character.
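The difference between passing a list and passing a bare string to update() can be seen in a short sketch. The fallback to collections.Counter is an assumption added so the snippet runs without NLTK; FreqDist subclasses Counter in NLTK 3, so update() behaves the same way here.

```python
try:
    from nltk import FreqDist
except ImportError:
    # Assumption: Counter stands in for FreqDist when NLTK is not installed.
    from collections import Counter as FreqDist

fd = FreqDist()
fd.update(["sense"])               # one sample: the word "sense"
fd.update(["and", "sensibility"])  # two more samples

chars = FreqDist()
chars.update("sense")              # a bare string is iterated letter by letter

print(fd["sense"])     # 1
print(chars["s"])      # 2
print(chars["sense"])  # 0, the whole word was never counted
```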
You should do it like so:
fd[word] += 1
But usually FreqDist is used like this:
fd = FreqDist(my_text)
Also look at the examples here:
http://www.nltk.org/book/ch01.html
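As a rough illustration of the FreqDist(my_text) form, here is a sketch on a made-up token list. The sample sentence is hypothetical, and the fallback to collections.Counter (the base class of FreqDist in NLTK 3) is an assumption added so the snippet runs without NLTK installed.

```python
try:
    from nltk import FreqDist
except ImportError:
    # Assumption: Counter stands in for FreqDist when NLTK is not installed.
    from collections import Counter as FreqDist

# Hypothetical token list; in practice this could be gutenberg.words(...).
my_text = "the cat sat on the mat and the dog sat too".split()

# Passing the tokens to the constructor counts them all in one step.
fd = FreqDist(my_text)

print(fd["the"])          # 3
print(fd.most_common(2))  # [('the', 3), ('sat', 2)]
```

Building the distribution in the constructor avoids the explicit loop and works the same way on a corpus reader's word list.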