Pickle incompatibility of numpy arrays between Python 2 and 3

遥遥无期 2020-11-28 01:12

I am trying to load the MNIST dataset linked here in Python 3.2 using this program:

import pickle
import gzip
import numpy


with gzip.open('mnist.pkl.gz',


        
7 Answers
  • 2020-11-28 01:53

    I just stumbled upon this snippet. Hope this helps to clarify the compatibility issue.

    import gzip
    import pickle
    import sys
    
    with gzip.open('mnist.pkl.gz', 'rb') as f:
        if sys.version_info.major > 2:
            # Python 3: decode Python 2 8-bit strings byte-for-byte
            train_set, valid_set, test_set = pickle.load(f, encoding='latin1')
        else:
            train_set, valid_set, test_set = pickle.load(f)
    
  • 2020-11-28 01:57

    It appears to be an incompatibility issue between Python 2 and Python 3. I tried loading the MNIST dataset with

        train_set, valid_set, test_set = pickle.load(file, encoding='iso-8859-1')
    

    and it worked on Python 3.5.2.
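
`iso-8859-1` and `latin1` are two names for the same codec in Python, so this is the same fix as the `latin1` answers. It is safe for binary array data because the codec maps each byte value 0-255 to the code point with the same number, so no byte can fail to decode and the original bytes can always be recovered. A minimal check using only the standard library (the variable names are illustrative):

```python
import codecs

# Both spellings resolve to the same underlying codec.
same_codec = codecs.lookup("latin1").name == codecs.lookup("iso-8859-1").name

# Every possible byte value decodes, and re-encoding gives the bytes back.
all_bytes = bytes(range(256))
as_text = all_bytes.decode("latin1")
round_trip_ok = as_text.encode("latin1") == all_bytes

print(same_codec, round_trip_ok)
```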

  • 2020-11-28 01:59

    There is also hickle, which I found faster and easier to use than pickle. I tried saving my data with pickle and reading it back, but ran into a lot of problems on load and wasted an hour without finding a solution. This was with my own data, for building a chatbot.

    vec_x and vec_y are numpy arrays:

    import hickle as hkl

    data = [vec_x, vec_y]
    hkl.dump(data, 'new_data_file.hkl')
    

    Then you just read it and perform the operations:

    data2 = hkl.load( 'new_data_file.hkl' )
    
  • 2020-11-28 02:07

    If you are getting this error in Python 3, it could be an incompatibility between Python 2 and Python 3. For me, the solution was to load with latin1 encoding:

    pickle.load(file, encoding='latin1')
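
If the same loading code has to handle both Python 3 pickles and older Python 2 ones, one option is to try the default load first and fall back to `latin1` only when decoding fails. This is a sketch, not a tested recipe for MNIST: `load_pickle_compat` is a hypothetical helper name, and the hand-written byte string stands in for a Python 2 pickle containing a non-ASCII `str`.

```python
import io
import pickle


def load_pickle_compat(fileobj):
    """Load a pickle, retrying with latin1 if a Python 2 str fails to decode.

    Hypothetical helper for illustration; fileobj must be seekable.
    """
    start = fileobj.tell()
    try:
        return pickle.load(fileobj)
    except UnicodeDecodeError:
        # Rewind and reinterpret Python 2 8-bit strings byte-for-byte.
        fileobj.seek(start)
        return pickle.load(fileobj, encoding="latin1")


# Stand-in for a Python 2 pickle: SHORT_BINSTRING opcode 'U', length 2,
# two raw bytes, then the STOP opcode '.'.
py2_style = io.BytesIO(b"U\x02\xff\x80.")
value = load_pickle_compat(py2_style)
print(repr(value))
```

The fallback only triggers on `UnicodeDecodeError`, so pickles written by Python 3 still load through the default path.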
    
  • 2020-11-28 02:10

    This seems like some sort of incompatibility. It's trying to load a "binstring" object, which is assumed to be ASCII, while in this case it is binary data. Whether this is a bug in the Python 3 unpickler or a "misuse" of the pickler by numpy, I don't know.

    Here is something of a workaround, but I don't know how meaningful the data is at this point:

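    import pickle
    import gzip
    import numpy
    
    # Assumes mnist.pkl.gz has already been decompressed to mnist.pkl.
    with open('mnist.pkl', 'rb') as f:
        # pickle._Unpickler is the private pure-Python unpickler class.
        u = pickle._Unpickler(f)
        u.encoding = 'latin1'  # decode Python 2 strings byte-for-byte
        p = u.load()
        print(p)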
    

    Unpickling it in Python 2 and then repickling it is only going to create the same problem again, so you need to save it in another format.
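
The "binstring" behaviour described above can be reproduced without the MNIST file. In the sketch below, the byte string is hand-written to mimic what Python 2 emits for a short 8-bit `str` (the `SHORT_BINSTRING` opcode); it is an illustration, not data taken from `mnist.pkl`. Python 3 decodes such strings as ASCII by default, while `encoding='latin1'` keeps every byte as one character and `encoding='bytes'` returns the raw bytes unchanged.

```python
import pickle

# 'U' = SHORT_BINSTRING, then a length byte (3), three raw bytes, '.' = STOP.
py2_style = b"U\x03\x00\xff\x10."

# Default encoding is ASCII, so the 0xff byte cannot be decoded.
try:
    pickle.loads(py2_style)
    default_error = None
except UnicodeDecodeError as exc:
    default_error = exc

# latin1 maps each byte to one character; 'bytes' skips decoding entirely.
as_latin1 = pickle.loads(py2_style, encoding="latin1")
as_bytes = pickle.loads(py2_style, encoding="bytes")

print(type(default_error).__name__, repr(as_latin1), repr(as_bytes))
```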

  • 2020-11-28 02:11

    It looks like there are some compatibility issues in pickle between 2.x and 3.x due to the move to Unicode. Your file appears to have been pickled with Python 2.x, and decoding it in 3.x could be troublesome.

    I'd suggest unpickling it with python 2.x and saving to a format that plays more nicely across the two versions you're using.
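
One format that avoids pickle for plain numeric arrays is NumPy's own `.npz` archive. A rough sketch of the conversion step follows, assuming NumPy is installed. The small random arrays, the file name, and the key names `train_x` and `train_y` are placeholders, not the real MNIST layout. The saving half would run under the Python version that can read the original pickle, and the loading half under the other.

```python
import os
import tempfile

import numpy as np

# Placeholder arrays standing in for data unpickled under Python 2.
train_x = np.random.rand(4, 784).astype(np.float32)
train_y = np.array([5, 0, 4, 1], dtype=np.int64)

# Save both arrays into one .npz archive under explicit key names.
path = os.path.join(tempfile.mkdtemp(), "mnist_subset.npz")
np.savez(path, train_x=train_x, train_y=train_y)

# Load them back by key; numeric arrays need no pickle support here.
with np.load(path) as archive:
    loaded_x = archive["train_x"]
    loaded_y = archive["train_y"]

print(loaded_x.shape, loaded_y.tolist())
```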
