Using a global dictionary with threads in Python

野性不改 2020-11-29 21:55

Is accessing/changing dictionary values thread-safe?

I have a global dictionary foo and multiple threads with ids id1, id2, ..

5 Answers
  • 2020-11-29 21:56

    The GIL takes care of that, if you happen to be using CPython.

    From the Python glossary entry for "global interpreter lock":

    The lock used by Python threads to assure that only one thread executes in the CPython virtual machine at a time. This simplifies the CPython implementation by assuring that no two processes can access the same memory at the same time. Locking the entire interpreter makes it easier for the interpreter to be multi-threaded, at the expense of much of the parallelism afforded by multi-processor machines. Efforts have been made in the past to create a “free-threaded” interpreter (one which locks shared data at a much finer granularity), but so far none have been successful because performance suffered in the common single-processor case.

    See also the related question "Are locks unnecessary in multi-threaded Python code because of the GIL?".

  • 2020-11-29 21:58

    Assuming CPython: Yes and no. It is actually safe to fetch/store values from a shared dictionary in the sense that multiple concurrent read/write requests won't corrupt the dictionary. This is due to the global interpreter lock ("GIL") maintained by the implementation. That is:

    Thread A running:

    a = global_dict["foo"]
    

    Thread B running:

    global_dict["bar"] = "hello"
    

    Thread C running:

    global_dict["baz"] = "world"
    

    won't corrupt the dictionary, even if all three access attempts happen at the "same" time. The interpreter will serialize them in some undefined way.
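
    As a minimal sketch of that first case, assuming CPython and a hypothetical worker function in which each thread writes only to its own key, the following runs several threads against one shared dictionary without any lock:

```python
import threading

global_dict = {}  # shared by all threads

def worker(thread_id):
    # Hypothetical worker: each thread only ever stores under its own key.
    for i in range(10_000):
        global_dict[thread_id] = i

threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread's last store is visible once all threads have been joined.
print(sorted(global_dict.items()))
```

    Each store here is a single dictionary operation, so the threads cannot leave the dictionary in a broken state. The sketch says nothing about operations that read a value and then write it back.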

    However, the result of the following sequence is undefined:

    Thread A:

    if "foo" not in global_dict:
       global_dict["foo"] = 1
    

    Thread B:

    global_dict["foo"] = 2
    

    because the test-and-set in thread A is not atomic (a "time-of-check to time-of-use" race condition): thread B may run between the check and the store. So it is generally best to lock such compound operations:

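    from threading import RLock
    
    global_dict = {}  # the dictionary shared by all threads
    lock = RLock()    # guards every compound operation on global_dict
    
    def thread_A():
        with lock:
            # The check and the store now happen as one unit.
            if "foo" not in global_dict:
                global_dict["foo"] = 1
    
    def thread_B():
        with lock:
            global_dict["foo"] = 2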
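
    The same pattern applies to any read-modify-write on a shared dictionary. Below is a small sketch, with a hypothetical counts dictionary and record_hits function, in which four threads increment one counter under a lock so that no update is lost:

```python
import threading

counts = {"hits": 0}            # hypothetical shared counter
counts_lock = threading.Lock()  # guards every update to counts

def record_hits(n):
    for _ in range(n):
        # Read, add, and write back while holding the lock.
        with counts_lock:
            counts["hits"] += 1

threads = [threading.Thread(target=record_hits, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counts["hits"])
```

    A plain Lock is enough here because the lock is never re-acquired by the thread that already holds it; RLock is only needed when locked code may call other locked code.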
    
  • 2020-11-29 22:05

    Since I needed something similar, I landed here. This short snippet sums up the answers above:

    #!/usr/bin/env python3
    
    import threading
    
    class ThreadSafeDict(dict):
        """A dict that carries its own lock, held via the with statement."""
    
        def __init__(self, *p_arg, **n_arg):
            dict.__init__(self, *p_arg, **n_arg)
            self._lock = threading.Lock()
    
        def __enter__(self):
            # Block until the lock is free, then hand back the dict itself.
            self._lock.acquire()
            return self
    
        def __exit__(self, exc_type, exc_value, traceback):
            # Always release, even if the with block raised.
            self._lock.release()
    
    if __name__ == '__main__':
    
        u = ThreadSafeDict()
        with u as m:
            m[1] = 'foo'
        print(u)
    

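    This way, you can use the with statement to hold the lock while working on the dict. Note that the lock only protects code inside a with block; accesses made outside one are not synchronized.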
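
    For illustration, here is a sketch of that class shared between several threads. The class is repeated so the example runs on its own; the shared instance and the add_many function are hypothetical names:

```python
import threading

class ThreadSafeDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._lock = threading.Lock()

    def __enter__(self):
        self._lock.acquire()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._lock.release()

shared = ThreadSafeDict(total=0)

def add_many(n):
    for _ in range(n):
        # The read-modify-write happens entirely while the lock is held.
        with shared as d:
            d["total"] += 1

threads = [threading.Thread(target=add_many, args=(5_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared["total"])
```

    Because the class uses a non-reentrant Lock, nesting one with block on the same instance inside another in the same thread would deadlock.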

  • 2020-11-29 22:14

    The best, safest, portable way to have each thread work with independent data is:

    import threading
    tloc = threading.local()
    

    Now each thread works with a totally independent tloc object even though it's a global name. The thread can get and set attributes on tloc, use tloc.__dict__ if it specifically needs a dictionary, etc.

    Thread-local storage for a thread goes away when the thread ends; to have threads record their final results, have them put those results, before they terminate, into a common instance of Queue.Queue (queue.Queue in Python 3), which is intrinsically thread-safe. Similarly, initial values for the data a thread is to work on could be arguments passed when the thread is started, or be taken from a Queue.
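
    A minimal sketch of this architecture in Python 3, assuming a hypothetical worker that sums the numbers it is given: each thread keeps its running total on tloc, receives its input as arguments, and reports its final result through a shared queue.

```python
import queue
import threading

tloc = threading.local()   # one global name, separate attributes per thread
results = queue.Queue()    # thread-safe channel for final results

def worker(thread_id, numbers):
    # tloc.total is private to the current thread.
    tloc.total = 0
    for n in numbers:
        tloc.total += n
    # Hand the result over before the thread (and its tloc data) goes away.
    results.put((thread_id, tloc.total))

threads = [
    threading.Thread(target=worker, args=(i, range(i * 10, i * 10 + 10)))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All producers have finished, so draining until empty is safe here.
collected = {}
while not results.empty():
    thread_id, total = results.get()
    collected[thread_id] = total

print(collected)
```

    No lock appears anywhere in the worker: nothing mutable is shared except the queue, which does its own synchronization.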

    Other half-baked approaches, such as hoping that operations that look atomic are indeed atomic, may happen to work for specific cases in a given version and release of Python, but could easily get broken by upgrades or ports. There's no real reason to risk such issues when a proper, clean, safe architecture is so easy to arrange, portable, handy, and fast.

  • 2020-11-29 22:16

    How does it work?

    >>> import dis
    >>> demo = {}
    >>> def set_dict():
    ...     demo['name'] = 'Jatin Kumar'
    ...
    >>> dis.dis(set_dict)
      2           0 LOAD_CONST               1 ('Jatin Kumar')
                  3 LOAD_GLOBAL              0 (demo)
                  6 LOAD_CONST               2 ('name')
                  9 STORE_SUBSCR
                 10 LOAD_CONST               0 (None)
                 13 RETURN_VALUE
    

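    Each of the above instructions executes with the GIL held, and the STORE_SUBSCR instruction adds or updates the key/value pair in the dictionary. So a single dictionary update is atomic, and hence thread-safe, in CPython. This applies to that one instruction only; an operation that compiles to several instructions is not atomic as a whole.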
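
    The same tool shows where atomicity ends. The sketch below, using two hypothetical functions, lists the instruction names for a plain store and for an in-place increment. The exact opcode names vary between CPython versions, but the increment is expected to compile to a longer sequence (load the old value, add, store the result), and another thread may run between those steps:

```python
import dis

demo = {"n": 0}

def set_item():
    demo["n"] = 1       # a single subscript store

def increment_item():
    demo["n"] += 1      # read, add, then store

# Collect just the instruction names for each function.
set_ops = [ins.opname for ins in dis.get_instructions(set_item)]
inc_ops = [ins.opname for ins in dis.get_instructions(increment_item)]

print(set_ops)
print(inc_ops)
```

    This is why an increment on a shared key needs a lock even though a single store does not.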
