Python Disk-Based Dictionary

逝去的感伤 2020-12-04 16:44

I was running some dynamic programming code (trying to brute-force disprove the Collatz conjecture =P) and I was using a dict to store the lengths of the chains I had alread

8 Answers
  • 2020-12-04 17:08

    Hash-on-disk is generally addressed with Berkeley DB or something similar - several options are listed in the Python Data Persistence documentation. You can front it with an in-memory cache, but I'd test against native performance first; with operating system caching in place it might come out about the same.
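
    As a rough sketch of the "front it with an in-memory cache" idea: the class below puts a small LRU cache in front of a dbm file from the standard library. The class name `CachedDiskDict`, the `cache_size` parameter, and the choice of `dbm.dumb` (picked only because it is available everywhere, not because it is fast) are assumptions for illustration; integer values are assumed throughout. Benchmark it against the uncached store before keeping it.

```python
import dbm.dumb
import os
import tempfile
from collections import OrderedDict


class CachedDiskDict:
    """Hypothetical helper: a small LRU cache in front of a dbm file."""

    def __init__(self, path, cache_size=1024):
        self._db = dbm.dumb.open(path, "c")  # "c": create the file if missing
        self._cache = OrderedDict()
        self._cache_size = cache_size

    def _remember(self, key, value):
        self._cache[key] = value
        self._cache.move_to_end(key)
        if len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)  # evict the least recently used

    def __getitem__(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)
            return self._cache[key]
        value = int(self._db[str(key)])  # raises KeyError if absent
        self._remember(key, value)
        return value

    def __setitem__(self, key, value):
        self._db[str(key)] = str(value)  # write through to disk
        self._remember(key, value)

    def __contains__(self, key):
        return key in self._cache or str(key) in self._db

    def close(self):
        self._db.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "chains")
    store = CachedDiskDict(path, cache_size=2)
    store[27] = 111  # example value only
    print(store[27], 27 in store, 28 in store)
    store.close()
```

    Writes go straight through to disk, so the cache only speeds up repeated reads.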

  • 2020-12-04 17:09

    The third-party shove module is also worth a look. It's very similar to shelve in that it is a simple dict-like object; however, it can store to various backends (such as file, SVN, and S3), provides optional compression, and is even thread-safe. It's a very handy module.

    from shove import Shove

    mem_store = Shove()                   # in-memory store
    file_store = Shove('file://mystore')  # file-backed store

    value = 111  # placeholder value for illustration
    file_store['key'] = value
    
  • 2020-12-04 17:09

    For simple use cases, sqlitedict can help. However, when you have much more complex databases, you might want to try one of the more upvoted answers.

  • 2020-12-04 17:10

    Last time I was facing a problem like this, I rewrote to use SQLite rather than a dict, and had a massive performance increase. That performance increase was at least partially on account of the database's indexing capabilities; depending on your algorithms, YMMV.

    A thin wrapper that does SQLite queries in __getitem__ and __setitem__ isn't much code to write.
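
    A minimal sketch of such a wrapper, using only the standard-library sqlite3 module. The class name `SQLiteDict`, the table name `kv`, and the integer-only schema are assumptions made for illustration; it also assumes keys and values fit in SQLite's signed 64-bit integers, which may not hold for very large numbers.

```python
import sqlite3


class SQLiteDict:
    """Hypothetical dict-like wrapper storing integer keys and values in SQLite."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        # INTEGER PRIMARY KEY gives an indexed lookup on key for free.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key INTEGER PRIMARY KEY, value INTEGER)"
        )

    def __getitem__(self, key):
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __setitem__(self, key, value):
        self._conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
        )

    def __contains__(self, key):
        row = self._conn.execute(
            "SELECT 1 FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row is not None

    def commit(self):
        self._conn.commit()

    def close(self):
        self._conn.commit()
        self._conn.close()


if __name__ == "__main__":
    lengths = SQLiteDict()  # in-memory here; pass a file path to persist
    lengths[27] = 111       # example value only
    print(lengths[27], 27 in lengths, 28 in lengths)
    lengths.close()
```

    Committing once per batch of writes rather than after every `__setitem__` is usually much cheaper, which is why `commit()` is left to the caller here.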

  • 2020-12-04 17:12

    You should fetch more than one item at a time if there's some heuristic for predicting which items are most likely to be retrieved next, and don't forget the indexes, as Charles mentions.
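
    One way to fetch several items per round trip, sketched with sqlite3. The function name `get_many`, the table name `kv`, and the `chunk_size` default are assumptions for illustration; the chunking is there because SQLite limits the number of `?` parameters in a single statement. Which keys to prefetch is left to whatever heuristic fits your algorithm.

```python
import sqlite3


def get_many(conn, keys, chunk_size=500):
    """Fetch several keys per query; returns {key: value} for keys that exist."""
    keys = list(keys)
    found = {}
    for start in range(0, len(keys), chunk_size):
        chunk = keys[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        rows = conn.execute(
            f"SELECT key, value FROM kv WHERE key IN ({placeholders})", chunk
        )
        found.update(rows)  # each row is a (key, value) pair
    return found


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # The primary key doubles as the index on key.
    conn.execute("CREATE TABLE kv (key INTEGER PRIMARY KEY, value INTEGER)")
    conn.executemany(
        "INSERT INTO kv (key, value) VALUES (?, ?)",
        [(i, i * i) for i in range(10)],
    )
    print(get_many(conn, [2, 3, 42]))  # 42 is absent, so it is simply omitted
    conn.close()
```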

  • 2020-12-04 17:19

    I've read that you think shelve is too slow and that you tried to hack your own dict using sqlite.

    Someone else did this too:

    http://sebsauvage.net/python/snyppets/index.html#dbdict

    It seems pretty efficient (and sebsauvage is a pretty good coder). Maybe you could give it a try?
