Can serialized objects be accessed simultaneously by different processes, and how do they behave if so?

别那么骄傲 2021-01-26 06:56

I have data that is best represented by a tree. Serializing the structure makes the most sense, because I don't want to sort it every time, and it would allow me to make persis

2 Answers
  • 2021-01-26 07:06

    Without trying it out I'm fairly sure the answer is:

    1. They can both be served at once. However, if one user is reading while the other is writing, the reader may get strange results, such as a partially written file.
    2. Probably not. Once the tree has been read from the file into memory, the other user will not see the first user's edits. If the other user has not yet read the tree from the file, they will pick up the change when they load it.
    3. Both changes will be written at the same time, and the file will likely be corrupted.

    Also, you mentioned shelve. From the shelve documentation:

    The shelve module does not support concurrent read/write access to shelved objects. (Multiple simultaneous read accesses are safe.) When a program has a shelf open for writing, no other program should have it open for reading or writing. Unix file locking can be used to solve this, but this differs across Unix versions and requires knowledge about the database implementation used.
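The file locking mentioned in that passage could look roughly like the sketch below. It assumes a Unix system (the `fcntl` module), that every process touching the shelf goes through the same helper, and a sidecar lock file whose `.lock` suffix is a hypothetical naming choice. The name `locked_shelf` is also made up for illustration. Locking a separate file avoids needing to know which files the underlying dbm backend creates.

```python
import fcntl
import os
import shelve
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_shelf(path, writable=False):
    """Open a shelf while holding an advisory lock on a sidecar file."""
    lock_path = path + ".lock"  # hypothetical naming convention
    with open(lock_path, "a") as lock_file:
        # Writers take an exclusive lock; readers share a lock with each other.
        fcntl.flock(lock_file, fcntl.LOCK_EX if writable else fcntl.LOCK_SH)
        try:
            shelf = shelve.open(path, flag="c" if writable else "r")
            try:
                yield shelf
            finally:
                shelf.close()  # flush to disk before the lock is released
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Usage: one writer, then one reader, on a shelf in a temporary directory.
shelf_path = os.path.join(tempfile.mkdtemp(), "tree")
with locked_shelf(shelf_path, writable=True) as db:
    db["root"] = {"children": ["a", "b"]}
with locked_shelf(shelf_path) as db:
    children = db["root"]["children"]
```

The lock is advisory, so it only protects against processes that also acquire it. A process that opens the shelf directly bypasses it.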

    Personally, at this point, you may want to look into using a simple key-value store like Redis with some kind of optimistic locking.
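To make "optimistic locking" concrete, here is a minimal sketch of the idea using an in-memory stand-in rather than Redis itself. The class `VersionedStore` and the function `update` are hypothetical names, not part of any Redis client. Each key carries a version number, a write succeeds only if the version is unchanged since it was read, and a writer that loses the race simply retries.

```python
import threading


class VersionedStore:
    """In-memory stand-in for a key-value store with compare-and-set."""

    def __init__(self):
        self._data = {}  # key -> (value, version)
        self._mutex = threading.Lock()

    def get(self, key, default=None):
        """Return (value, version); a missing key has version 0."""
        with self._mutex:
            return self._data.get(key, (default, 0))

    def set_if_version(self, key, value, expected_version):
        """Write only if nobody else has written since we read."""
        with self._mutex:
            _, current = self._data.get(key, (None, 0))
            if current != expected_version:
                return False  # conflict: someone else got there first
            self._data[key] = (value, current + 1)
            return True


def update(store, key, modify, default=None, max_retries=1000):
    """Read, modify, and write back, retrying on a version conflict.

    `modify` must return a new value rather than mutate its argument.
    """
    for _ in range(max_retries):
        value, version = store.get(key, default)
        if store.set_if_version(key, modify(value), version):
            return
    raise RuntimeError("too many conflicting writers")


# Usage: add a child to a tree node stored as an immutable tuple.
store = VersionedStore()
update(store, "root", lambda kids: kids + ("a",), default=())
```

No writer ever blocks another while it computes its change, which is why this suits workloads where conflicts are rare.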

  • 2021-01-26 07:21

    You might try klepto, which provides a dictionary interface to a SQL database (using sqlalchemy under the covers). If you choose to persist your data to MySQL, PostgreSQL, or another available database (aside from SQLite), then two or more people, or two threads or processes, can access the database tables simultaneously, and the database manages the concurrent reads and writes. Using klepto with a database backend will perform under concurrent access as well as if you were accessing the database directly.

    If you don't want to use a database backend, klepto can write to disk as well. There is some potential for conflict when writing to disk, even though klepto uses a "copy-on-write, then replace" strategy that minimizes concurrency conflicts when working with files on disk. When working with a file (or directory) backend, your issues 1-2-3 are still handled due to the strategy klepto employs for saving writes to disk.

    Additionally, klepto can use an in-memory caching layer that enables fast access, where loads and dumps from the on-disk (or database) backend are done either on demand or when the in-memory cache reaches a user-determined size.

    To be specific: (1) Both are served at the same time. (2) If one user makes an edit, the other user sees the change, although it may be delayed if the second user is using an in-memory caching layer. (3) Multiple simultaneous writes are not a problem, because klepto lets NFS or the SQL database handle the "copy-on-write, then replace" changes.
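As a rough illustration of the general "write a copy, then replace" idea for files on disk, here is a standard-library sketch. It is not klepto's actual implementation, and `save_atomically` and `load` are hypothetical helper names. The new contents go to a temporary file in the same directory, which is then swapped into place with `os.replace`, so a reader sees either the old file or the new one and never a half-written one.

```python
import os
import pickle
import tempfile


def save_atomically(obj, path):
    """Pickle obj to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem for the swap to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            pickle.dump(obj, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach the disk
        os.replace(tmp_path, path)  # readers see old or new, never partial
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)  # do not leave a stray temp file behind
        raise


def load(path):
    """Read back an object written by save_atomically."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Usage: persist a small tree and read it back.
tree_path = os.path.join(tempfile.mkdtemp(), "tree.pkl")
save_atomically({"root": ["a", "b"]}, tree_path)
tree = load(tree_path)
```

This only prevents corrupted files. If two writers save at nearly the same time, the last replace wins and the other edit is lost, so merging concurrent edits still needs locking or a database.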

    The dictionary interface for klepto.archives is also available in a decorator form that provides LRU caching (and LFU and others). If you have a function that generates or accesses the data, hooking up the archive is easy: you get memoization with an on-disk or database backend.

    With klepto, you can pick from several different methods to encode your data. You can have klepto cast data to a string, use a hashing algorithm (like md5), or use a pickler (like json, pickle, or dill).

    You can get klepto here: https://github.com/uqfoundation/klepto
