I\'ve created an object like this:
company1.name = \'banana\'
company1.value = 40
I would like to save this object. How can I do that?
I think it's a pretty strong assumption to assume that the object is a class
. What if it's not a class
? There's also the assumption that the object was not defined in the interpreter. What if it was defined in the interpreter? Also, what if the attributes were added dynamically? When some python objects have attributes added to their __dict__
after creation, pickle
doesn't respect the addition of those attributes (i.e. it 'forgets' they were added -- because pickle
serializes by reference to the object definition).
In all these cases, pickle
and cPickle
can fail you horribly.
If you are looking to save an object
(arbitrarily created), where you have attributes (either added in the object definition, or afterward)… your best bet is to use dill
, which can serialize almost anything in python.
We start with a class…
Python 2.7.8 (default, Jul 13 2014, 02:29:54)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> class Company:
... pass
...
>>> company1 = Company()
>>> company1.name = 'banana'
>>> company1.value = 40
>>> with open('company.pkl', 'wb') as f:
... pickle.dump(company1, f, pickle.HIGHEST_PROTOCOL)
...
>>>
Now shut down, and restart...
Python 2.7.8 (default, Jul 13 2014, 02:29:54)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> with open('company.pkl', 'rb') as f:
... company1 = pickle.load(f)
...
Traceback (most recent call last):
File "", line 2, in
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1378, in load
return Unpickler(file).load()
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 858, in load
dispatch[key](self)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1090, in load_global
klass = self.find_class(module, name)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1126, in find_class
klass = getattr(mod, name)
AttributeError: 'module' object has no attribute 'Company'
>>>
Oops… pickle
can't handle it. Let's try dill
. We'll throw in another object type (a lambda
) for good measure.
Python 2.7.8 (default, Jul 13 2014, 02:29:54)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> class Company:
... pass
...
>>> company1 = Company()
>>> company1.name = 'banana'
>>> company1.value = 40
>>>
>>> company2 = lambda x:x
>>> company2.name = 'rhubarb'
>>> company2.value = 42
>>>
>>> with open('company_dill.pkl', 'wb') as f:
... dill.dump(company1, f)
... dill.dump(company2, f)
...
>>>
And now read the file.
Python 2.7.8 (default, Jul 13 2014, 02:29:54)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> with open('company_dill.pkl', 'rb') as f:
... company1 = dill.load(f)
... company2 = dill.load(f)
...
>>> company1
<__main__.Company instance at 0x107909128>
>>> company1.name
'banana'
>>> company1.value
40
>>> company2.name
'rhubarb'
>>> company2.value
42
>>>
It works. The reason pickle
fails, and dill
doesn't, is that dill
treats __main__
like a module (for the most part), and also can pickle class definitions instead of pickling by reference (like pickle
does). The reason dill
can pickle a lambda
is that it gives it a name… then pickling magic can happen.
Actually, there's an easier way to save all these objects, especially if you have a lot of objects you've created. Just dump the whole python session, and come back to it later.
Python 2.7.8 (default, Jul 13 2014, 02:29:54)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> class Company:
... pass
...
>>> company1 = Company()
>>> company1.name = 'banana'
>>> company1.value = 40
>>>
>>> company2 = lambda x:x
>>> company2.name = 'rhubarb'
>>> company2.value = 42
>>>
>>> dill.dump_session('dill.pkl')
>>>
Now shut down your computer, go enjoy an espresso or whatever, and come back later...
Python 2.7.8 (default, Jul 13 2014, 02:29:54)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.load_session('dill.pkl')
>>> company1.name
'banana'
>>> company1.value
40
>>> company2.name
'rhubarb'
>>> company2.value
42
>>> company2
at 0x1065f2938>
The only major drawback is that dill
is not part of the python standard library. So if you can't install a python package on your server, then you can't use it.
However, if you are able to install python packages on your system, you can get the latest dill
with git+https://github.com/uqfoundation/dill.git@master#egg=dill
. And you can get the latest released version with pip install dill
.