Importing pandas didn't throw the error; it came from trying to read a pickled pandas dataframe, as such:
import numpy as np
import pandas as pd
import matplotl
Saving and loading in different versions of pandas using pickle often does not work. Instead, use pandas.HDFStore.
When I needed to update pandas but also needed some data saved with pickle in previous versions, I went back and re-saved that data in HDF format instead, when nothing else would work. No problems anymore.
Works for any sort of pandas data structure it seems, even multi-indexed dataframes! In short, if pickling fails after a version upgrade, try HDFStore; it's more reliable (and more efficient!).
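To make the re-saving step concrete, here is a minimal sketch of the round trip through HDFStore. The helper names save_to_hdf and load_from_hdf and the default key "data" are my own choices for illustration, not part of pandas, and HDFStore needs the optional PyTables package (tables) to be installed. The idea is to run the save in the old environment, where the pickle still loads, and the load in the upgraded one.

```python
import pandas as pd


def save_to_hdf(df, hdf_path, key="data"):
    """Write a pandas object to an HDF5 file under `key` (hypothetical helper)."""
    # mode="w" creates the file, replacing any existing file at hdf_path.
    with pd.HDFStore(hdf_path, mode="w") as store:
        store.put(key, df)


def load_from_hdf(hdf_path, key="data"):
    """Read the object stored under `key` back from the HDF5 file."""
    with pd.HDFStore(hdf_path, mode="r") as store:
        return store.get(key)


# Typical use (file names are placeholders):
#   old environment:  save_to_hdf(pd.read_pickle("frame.pkl"), "frame.h5")
#   new environment:  df = load_from_hdf("frame.h5")
```

I have only sketched the simplest case here; whether every dtype in your frame survives the trip is something to check on your own data.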
I had this problem from trying to open a pickled dataframe made with pandas 0.18.1 using pandas 0.17.1. If you are using pip, upgrade pandas with:
pip install --upgrade pandas
If you are using a distribution like Anaconda, use:
conda upgrade pandas
If you need to have both versions of pandas on your machine, consider using virtualenv.
Here is a solution that does not require updating pandas or whatever else you're using.
If you're using Python 2 (cPickle.load takes no encoding argument there):
import cPickle
with open('filename.pkl', 'rb') as fo:
    data = cPickle.load(fo)
If you're using Python 3:
import pickle
with open('filename.pkl', 'rb') as fo:
    data = pickle.load(fo, encoding='latin1')
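If you don't know in advance whether a file was written by Python 2, a small wrapper can try the default decoding first and fall back to latin1 only when that fails. This is a sketch for Python 3 only; the function name load_legacy_pickle is made up for illustration, and it assumes the failure shows up as a UnicodeDecodeError, which is the usual symptom for Python 2 strings containing non-ASCII bytes.

```python
import pickle


def load_legacy_pickle(path):
    """Load a pickle, retrying with latin1 decoding for Python 2 era files."""
    with open(path, 'rb') as fo:
        try:
            return pickle.load(fo)
        except UnicodeDecodeError:
            # Rewind and decode Python 2 byte strings as latin1 instead of ASCII.
            fo.seek(0)
            return pickle.load(fo, encoding='latin1')
```

Usage would be data = load_legacy_pickle('filename.pkl'). Other unpickling errors (for example, a missing module) are deliberately not caught, since latin1 would not help with those.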
In pandas 0.23.4 there is a better way to fix the problem: use pandas.read_pickle to read the file object, like:
pd.read_pickle(open('test_report.pickle', 'rb'))
If you want to read pickled bytes held in memory instead of a file, do
import io
pd.read_pickle(io.BytesIO(pickled_text))
If you face the error ValueError: Unrecognized compression type: infer, explicitly specify the compression type. It can be one of None (no compression), gzip, bz2, xz, or zip, depending on the file extension.
pd.read_pickle(io.BytesIO(pickled_text), compression=None)
A flexible way to deal with internal API changes that break unpickling is to implement a custom Unpickler subclass.
For example, the pandas.indexes module has been moved to pandas.core.indexes. We can write an Unpickler that adapts the module path accordingly. To do that, we can override the method find_class:
import pickle
import sys

class Unpickler(pickle.Unpickler):
    def find_class(self, module, name):
        '''This method gets called for every class or function pickle tries to load.'''
        # Python 2 --> 3 compatibility: __builtin__ has been renamed to builtins
        if module == '__builtin__':
            module = 'builtins'
        # pandas compatibility: in newer versions, pandas.indexes has been moved to pandas.core.indexes
        if 'pandas.indexes' in module:
            module = module.replace('pandas.indexes', 'pandas.core.indexes')
        __import__(module)
        return getattr(sys.modules[module], name)

with open('/path/to/pickle.pkl', 'rb') as file:
    pdf = Unpickler(file).load()
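If more than one module has moved, the same find_class idea can be driven by a lookup table, so new renames are one extra line. The sketch below is a variant of the approach above, not anything pandas ships: MODULE_RENAMES, remap_module and RenamingUnpickler are names I made up, and the table only contains the two renames mentioned in this answer. It matches on whole module prefixes and delegates the actual lookup to the standard pickle.Unpickler.find_class.

```python
import io
import pickle

# Hypothetical rename table: old module prefix -> new module prefix.
MODULE_RENAMES = {
    '__builtin__': 'builtins',
    'pandas.indexes': 'pandas.core.indexes',
}


def remap_module(module, renames=MODULE_RENAMES):
    """Return the new module path if `module` falls under a renamed prefix."""
    for old, new in renames.items():
        if module == old or module.startswith(old + '.'):
            return new + module[len(old):]
    return module


class RenamingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Rewrite the module path, then let the standard lookup do the import.
        return super().find_class(remap_module(module), name)


# Small check without pandas: a hand-written pickle that calls __builtin__.list()
legacy_bytes = b'c__builtin__\nlist\n)R.'
result = RenamingUnpickler(io.BytesIO(legacy_bytes)).load()
```

With a real file you would pass the open file object instead of the BytesIO, exactly as in the example above.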