Use Python 2 shelf in Python 3

醉酒成梦 2021-02-06 10:00

I have data stored in a shelf file created with Python 2.7.

When I try to access the file from Python 3.4, I get an error:

>>> import shelve
>         


        
4 answers
  • 2021-02-06 10:47

    Edited: You may need to rename your database. Read on...

    Seems like pickle is not the culprit here. shelve also relies on anydbm (Python 2.x) or dbm (Python 3) to create/open a database and store the pickled information.
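To make the split between the two layers concrete, here is a minimal Python 3 sketch (not part of the original answer). It writes one value through shelve, then reopens the same file with dbm and unpickles the raw bytes by hand. The helper name `raw_shelf_value` and the file name `demo` are made up for illustration.

```python
import dbm
import os
import pickle
import shelve
import tempfile


def raw_shelf_value(key, value):
    """Store value in a shelf, then fetch the raw bytes through dbm directly."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "demo")  # hypothetical database name
        # Layer 1: shelve pickles the value and hands the bytes to dbm.
        with shelve.open(base) as shelf:
            shelf[key] = value
        # Layer 2: dbm alone only sees byte keys and byte values.
        with dbm.open(base, "r") as db:
            raw = db[key.encode("utf-8")]
    return raw


if __name__ == "__main__":
    raw = raw_shelf_value("answer", [1, 2, 3])
    print(type(raw), pickle.loads(raw))
```

If the dbm layer cannot open the file at all, as in the traceback further down, the pickle layer is never reached, which is why the protocol version is unlikely to be the first thing to check.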

    I created (manually) a database file using the following:

    # Python 2.7
    import anydbm
    anydbm.open('database2', flag='c')
    

    and

    # Python 3.4
    import dbm
    dbm.open('database3', flag='c')
    

    In both cases, it creates the same kind of database (may be distribution dependent, this is on Debian 7):

    $ file *
    database2:    Berkeley DB (Hash, version 9, native byte-order)
    database3.db: Berkeley DB (Hash, version 9, native byte-order)
    

    anydbm can open database3.db without problems, as expected:

    >>> anydbm.open('database3')
    <dbm.dbm object at 0x7fb1089900f0>
    

    Notice the lack of .db when specifying the database name, though. Python 3's dbm, however, chokes on database2, which is weird:

    >>> dbm.open('database2')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.4/dbm/__init__.py", line 88, in open
        raise error[0]("db type could not be determined")
    dbm.error: db type could not be determined
    

    unless I change the name of the database to database2.db:

    $ mv database2 database2.db
    $ python3
    >>> import dbm
    >>> dbm.open('database2')
    <_dbm.dbm object at 0x7fa7eaefcf50>
    

    So, I suspect a regression in the dbm module, but I haven't checked the documentation. It may be intended :-?

    NB: In my case the extension is .db, but that depends on which database dbm uses by default! Create an empty shelf using Python 3 to find out which one you are using and what file name it expects.
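The "create an empty shelf and look" step could be sketched like this in Python 3. It is only an illustration: `probe_default_backend` and the file name `probe` are hypothetical, and the result depends on how your Python was built.

```python
import dbm
import os
import shelve
import tempfile


def probe_default_backend():
    """Create an empty shelf and report the backend and the files it produced."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "probe")  # hypothetical shelf name
        # Let shelve pick whatever dbm backend this interpreter defaults to.
        shelve.open(base, flag="c").close()
        # whichdb() names the module that recognises the file(s) on disk.
        backend = dbm.whichdb(base)
        files = sorted(os.listdir(tmp))
    return backend, files


if __name__ == "__main__":
    backend, files = probe_default_backend()
    print("backend:", backend)
    print("files:  ", files)
```

The listed file names show which extension, if any, the backend adds. A Python 2 file would need the same naming before dbm.open can recognise it, assuming the on-disk format itself is one that Python 3 can still read.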

  • 2021-02-06 10:49

    I don't think it's possible to use a Python 2 shelf with Python 3's shelve module. The underlying files are completely different, at least in my tests.

    In Python 2*, a shelf is represented as a single file with the filename you originally gave it.

    In Python 3*, a shelf consists of three files: filename.bak, filename.dat, and filename.dir. Without any of these files present, the shelf cannot be opened by the Python 3 library (though it appears that just the .dat file is sufficient for opening, if not actual reading).
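The .dat/.dir/.bak layout described here is what the pure-Python dbm.dumb backend writes, which suggests those Python 3 installations were falling back to it. A small sketch to see the files for yourself (the helper name `dumb_files` and the base name `shelf` are made up):

```python
import dbm.dumb
import os
import tempfile


def dumb_files():
    """Write one record with dbm.dumb and list the files left on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "shelf")  # hypothetical base name
        db = dbm.dumb.open(base, "c")
        db[b"key"] = b"value"
        db.close()
        return sorted(os.listdir(tmp))


if __name__ == "__main__":
    print(dumb_files())
```

You should see at least the .dat (data) and .dir (index) files. The .bak file is a backup of the index and may or may not be present, depending on the Python version and on how often the database has been updated.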

    @Ricardo Cárdenes has given an overview of why this may be: it's likely an issue with the underlying database modules used in storing the shelved data. It's possible that the databases are backwards compatible, but I don't know, and a quick search hasn't turned up any obvious answers.

    I think it's likely that some of the possible databases implemented by dbm are backwards-compatible, whereas others are not: this could be the cause of the discrepancy between answers here, where some people, but not all, are able to open older databases directly by specifying a protocol.


    *On every machine I've tested, using Python 2.7.6 vs Pythons 3.2.5, 3.3.4, and 3.4.1

  • 2021-02-06 10:54

    The shelve module uses Python's pickle, which may require a specific protocol version when the data is shared between different versions of Python.

    Try supplying protocol version 2:

    population = shelve.open('shelved.shelf', protocol=2)
    

    According to the documentation:

    Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes. Refer to PEP 307 for information about improvements brought by protocol 2.

    This may be the protocol used in the original serialization (or pickling), although Python 2's shelve defaults to protocol 0 unless told otherwise.
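One thing worth checking before relying on this: the protocol argument controls how values are written, while pickle detects the protocol by itself when reading. The Python 3 sketch below illustrates both points. The helper name `write_with_protocol_2` is hypothetical, and the file is created in a temporary directory rather than touching a real shelf.

```python
import os
import pickle
import shelve
import tempfile


def write_with_protocol_2(key, value):
    """Write a value using pickle protocol 2, then read it back without naming a protocol."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "shelved.shelf")
        # protocol=2 affects how this process pickles values on write,
        # which keeps them loadable by Python 2.3 and later.
        with shelve.open(base, protocol=2) as shelf:
            shelf[key] = value
        # The reader passes no protocol: pickle detects it from the data.
        with shelve.open(base) as shelf:
            return shelf[key]


if __name__ == "__main__":
    print(write_with_protocol_2("population", {"city": 12345}))
    # A protocol 2 pickle starts with the PROTO opcode and the version byte.
    print(pickle.dumps([1], protocol=2)[:2])
```

So protocol=2 is mainly useful when Python 3 writes a shelf that Python 2 must read. It does not change which database file format is used underneath.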

  • 2021-02-06 11:03

    As I understand it now, here is the path that led to my problem:

    • The original shelf was created with Python 2 in Windows
    • Python 2 on Windows defaults to bsddb as the underlying database for shelving, since dbm is not available on the Windows platform
    • Python 3 does not ship with bsddb. The underlying database in Python 3 for Windows is dumbdbm (named dbm.dumb in Python 3).

    At first I looked into installing a third-party bsddb module for Python 3, but it quickly started to turn into a hassle. It then seemed that it would be a recurring hassle any time I needed to use the same shelf file on a new machine. So I decided to convert the file from bsddb to dumbdbm, which both my Python 2 and Python 3 installations can read.

    I ran the following in Python 2, which is the version that contains both bsddb and dumbdbm:

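    # Python 2.7: copy a bsddb-backed shelf into a dumbdbm-backed one
    import shelve
    import dumbdbm
    
    def dumbdbm_shelve(filename, flag="c"):
        # Wrap a dumbdbm database in a Shelf so values are still pickled
        return shelve.Shelf(dumbdbm.open(filename, flag))
    
    out_shelf = dumbdbm_shelve("shelved.dumbdbm.shelf")
    in_shelf = shelve.open("shelved.shelf")
    
    # Copy every entry from the old shelf into the new one
    for key in in_shelf.keys():
        out_shelf[key] = in_shelf[key]
    
    out_shelf.close()
    in_shelf.close()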
    

    So far it looks like the dumbdbm.shelf files came out ok, pending a double-check of the contents.
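For the Python 3 side, a matching reader might look like the sketch below. It is an assumption rather than something taken from the answer: `open_dumb_shelf` and `roundtrip_demo` are hypothetical helpers, and the demo writes its own small file in a temporary directory instead of reading a real converted shelf.

```python
import dbm.dumb
import os
import shelve
import tempfile


def open_dumb_shelf(filename, flag="r"):
    """Open a shelf explicitly backed by dbm.dumb (dumbdbm in Python 2)."""
    return shelve.Shelf(dbm.dumb.open(filename, flag))


def roundtrip_demo():
    """Write a small dumb-backed shelf, then read it back read-only."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "shelved.dumbdbm.shelf")
        writer = open_dumb_shelf(base, "c")
        writer["population"] = {"city": 12345}
        writer.close()
        reader = open_dumb_shelf(base, "r")
        try:
            return dict(reader)
        finally:
            reader.close()


if __name__ == "__main__":
    print(roundtrip_demo())
```

Naming the backend explicitly avoids depending on whatever dbm.open would pick on a given machine. Values that contain non-ASCII Python 2 str objects may still need extra handling, since pickle decodes those as ASCII by default when loading in Python 3.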
