I am using the Python-based Sage mathematics software to create a very long list of vectors. The list contains roughly 100,000,000 elements, and sys.getsizeof() tells me that it is a little less than 1GB in size.
I pickle this list into a file (which already takes a long time, but fair enough). It is only when I unpickle the list that it gets annoying: the RAM usage increases from 1.15GB to 4.3GB, and I am wondering what's going on.
How can I find out in Sage what all the memory is used for? And do you have any ideas for optimizing this, maybe by applying some Python tricks?
This is a reply to kcrisman's comment.
I cannot post the exact code since it would be too long, but here is a simple example where the phenomenon can be observed. I am working on Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.51-1 x86_64 GNU/Linux.
Start Sage and execute:
import pickle
L = [vector([1,2,3]) for k in range(1000000)]
f = open("mylist", 'w')
pickle.dump(L, f)
f.close()
On my system the list is 8697472 bytes big, and the file I pickled into is roughly 130MB. Now close Sage and watch your memory (with htop, for example). Then execute the following lines:
import pickle
f = open("mylist", 'r')
L = pickle.load(f)
f.close()
Without Sage, my Linux system uses 1035MB of memory; with Sage running, the usage increases to 1131MB. After I unpickled the file it uses 2535MB, which I find odd.
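(Side note on the numbers: sys.getsizeof() only measures the list object itself, i.e. its array of pointers, not the vectors it points to, which is why 8697472 bytes for the list and a 130MB pickle file are not contradictory. A rough deep estimate can be made with a recursive helper like the sketch below; this is only an approximation of mine, and memory held at the C level inside Sage vectors will still be undercounted.)
import sys
def deep_getsizeof(obj, seen=None):
    # Recursively sum sys.getsizeof() over nested Python containers.
    # Memory held by C extensions (e.g. inside Sage vectors) stays invisible.
    if seen is None:
        seen = set()
    if id(obj) in seen:  # don't count shared objects twice
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(x, seen) for x in obj)
    return size
deep_getsizeof(L)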
It's probably better not to use Python's pickle module directly. cPickle is already a bit better, but a lot of pickling in Sage assumes protocol 2, which (c)pickle doesn't use by default. You can use Sage's own wrappers of pickle. If I do your example with
sage: open("mylist",'w').write(dumps(L))
and then load it in a fresh session via
sage: L = loads(open("mylist",'r').read())
I observe no problems.
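If you do want to stay with the plain pickle interface, a minimal sketch of the same idea with cPickle and an explicit protocol 2 (assuming your objects pickle cleanly that way; Sage's wrappers take care of this for you) would be:
import cPickle
# dump with protocol 2 in binary mode
f = open("mylist", 'wb')
cPickle.dump(L, f, 2)
f.close()
# and load it back in a fresh session
f = open("mylist", 'rb')
L = cPickle.load(f)
f.close()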
Note that the above interface is not the best one for pickling/unpickling to a file in Sage. You'd be better off using save/load; I just did it that way to stay as close as possible to your example.
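For completeness, the save/load variant amounts to:
sage: save(L, "mylist")   # writes mylist.sobj
sage: L = load("mylist")  # in a fresh session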
Source: https://stackoverflow.com/questions/20294628/using-pythons-pickle-in-sage-results-in-high-memory-usage