A friend of mine wrote this little program.
The textFile is 1.2 GB in size (7 years' worth of newspapers).
He successfully manages to create the dictionary but he
One solution is to use buzhug instead of pickle. It's a pure-Python solution and retains very Pythonic syntax. I think of it as the next step up from shelve and its ilk. It will handle the data sizes you're talking about: its size limit is 2 GB per field (each field is stored in a separate file).
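
For comparison, here is a rough sketch of the shelve approach that buzhug is a step up from. This is not buzhug code, and it assumes the dictionary maps words to counts, which the question doesn't actually say. The names `count_words` and `lookup` are made up for illustration. The point is only that the entries live on disk, so nothing has to be pickled or unpickled in one piece.

```python
import os
import shelve
import tempfile


def count_words(lines, db_path):
    """Accumulate word counts in an on-disk shelf (assumed word -> count mapping)."""
    with shelve.open(db_path) as db:
        for line in lines:
            for word in line.split():
                # Each assignment is written through to the shelf on disk.
                db[word] = db.get(word, 0) + 1


def lookup(db_path, word):
    """Read a single count back without loading the whole mapping."""
    with shelve.open(db_path, flag="r") as db:
        return db.get(word, 0)


if __name__ == "__main__":
    # Tiny stand-in for the real text file; in practice you would pass
    # an open file object so lines are read one at a time.
    sample = ["the press printed the paper", "the paper sold out"]
    path = os.path.join(tempfile.mkdtemp(), "counts")
    count_words(sample, path)
    print(lookup(path, "paper"))
```

Updating the shelf once per word is slow on a file that large, so in practice you would probably count in memory in batches and flush each batch to the shelf.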