Python pickle: fix \r characters before loading

忘掉有多难 2021-01-02 13:38

I have a pickled object (a list containing a few numpy arrays) that was created on Windows and apparently saved to a file opened in text mode rather than binary mode (i.e. with

4 answers
  •  迷失自我
    2021-01-02 14:28

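    Assuming the file was created with protocol 0 (the default in Python 2, and ASCII-compatible), you should be able to load it anywhere by opening it with open('pickled_file', 'rU'), i.e. with universal newlines. Note that the 'U' flag belongs to older Pythons; it was removed in Python 3.11.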

    If this doesn't work, show us the first few hundred bytes: run print(repr(open('pickled_file', 'rb').read(200))) and paste the result into an edit of your question.
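A rough sketch of how you could check the header yourself, assuming only that pickles written with protocol 2 or higher begin with the byte '\x80' followed by the protocol number. The helper names read_head and describe_pickle_header are made up for this illustration; they are not part of the pickle module.

```python
import pickle


def read_head(path, n=200):
    """Return the first n raw bytes of a file, read in binary mode."""
    with open(path, "rb") as f:
        return f.read(n)


def describe_pickle_header(data):
    """Guess the pickle protocol from the leading bytes (heuristic only)."""
    # Protocol 2 and later start with the PROTO opcode (0x80) and a version byte.
    if len(data) >= 2 and data[0:1] == b"\x80":
        return "binary protocol %d" % ord(data[1:2])
    # Protocols 0 and 1 have no such header, so they cannot be told apart here.
    return "protocol 0 or 1 (no PROTO header)"
```

For example, print(describe_pickle_header(read_head('pickled_file'))) reports the protocol without unpickling anything, so it is safe to run on a file you do not trust yet.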

    Update after file contents were published:

    Your file starts with '\x80\x02', so it was dumped with protocol 2, the highest protocol available in Python 2. Protocols 1 and 2 are binary protocols. Your file was written in text mode on Windows, which caused the C runtime to convert each '\n' to '\r\n'. Pickle files should always be opened in binary mode, like this:

    import pickle

    # obj is whatever object you want to save
    with open('result.pickle', 'wb') as f:  # b for binary
        pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL)

    with open('result.pickle', 'rb') as f:  # b for binary
        obj = pickle.load(f)
    

    See the pickle module documentation for details. This code works portably on both Windows and non-Windows systems.

    You can recover the original pickle image by reading the file in binary mode and then reversing the damage: replace every occurrence of '\r\n' with '\n'. Note: this recovery procedure is necessary whether you are reading the file on Windows or not.
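The recovery step described above could look roughly like this. It is a minimal sketch, assuming the only damage is the '\n' to '\r\n' conversion done by text-mode writing; the function names repair_pickle_bytes and load_damaged_pickle are invented for this example.

```python
import pickle


def repair_pickle_bytes(damaged):
    """Undo the text-mode newline translation: every b'\\r\\n' becomes b'\\n'."""
    return damaged.replace(b"\r\n", b"\n")


def load_damaged_pickle(path):
    """Read the file as raw bytes, repair it, then unpickle the result."""
    with open(path, "rb") as f:  # binary mode, so nothing is translated again
        damaged = f.read()
    # Only unpickle data you trust: pickle can execute arbitrary code.
    return pickle.loads(repair_pickle_bytes(damaged))
```

Calling obj = load_damaged_pickle('result.pickle') should then return the original object. Because text-mode writing put a '\r' in front of every '\n', including any that already followed a '\r', the replacement restores the original bytes exactly. Once loaded, it is worth re-saving the object with a file opened in 'wb' mode so the repair is not needed again.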
