Are locks unnecessary in multi-threaded Python code because of the GIL?

遥遥无期 2020-12-02 08:11

If you are relying on an implementation of Python that has a Global Interpreter Lock (i.e. CPython) and writing multithreaded code, do you really need locks at all?

9 answers
  • 2020-12-02 09:04

    No - the GIL just protects Python's internals from multiple threads altering their state at once. This is a very low level of locking, sufficient only to keep Python's own structures in a consistent state. It doesn't cover the application-level locking you'll need for thread safety in your own code.

    The essence of locking is to ensure that a particular block of code is executed by only one thread at a time. The GIL enforces this for blocks the size of a single bytecode, but usually you want the lock to span a larger block of code than this.
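As a rough illustration of a lock that spans more than one bytecode, here is a minimal sketch of a check-then-update operation guarded by `threading.Lock`. The `Account` class, `withdraw`, and `run_withdrawals` are hypothetical names invented for this example, not part of any library.

```python
import threading


class Account:
    """Hypothetical shared object, used only for illustration."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        # The check and the update compile to many bytecodes. Holding the
        # lock makes them one critical section, so no other thread can
        # change the balance between the check and the subtraction.
        with self._lock:
            if self.balance >= amount:
                self.balance -= amount
                return True
            return False


def run_withdrawals(account, n_threads, amount):
    """Start n_threads threads that each try to withdraw `amount` once."""
    results = []
    results_lock = threading.Lock()

    def worker():
        ok = account.withdraw(amount)
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


account = Account(100)
results = run_withdrawals(account, n_threads=20, amount=10)
print(sum(results), account.balance)
```

With the lock in place, exactly ten of the twenty withdrawals can succeed and the balance cannot go negative. Without it, two threads could both pass the balance check before either one subtracts.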

  • 2020-12-02 09:06

    The Global Interpreter Lock prevents threads from running in the interpreter simultaneously (so CPU-bound Python threads effectively use only one core at a time in CPython). However, as I understand it, the threads are still interrupted and scheduled preemptively, which means you still need locks on shared data structures, lest your threads step on each other's toes.
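One way to see that thread switches are driven by the interpreter rather than by your code: CPython 3 exposes the interval after which it asks the running thread to give up the GIL. This is a small sketch assuming Python 3.2 or later, where `sys.getswitchinterval()` is available. It only reads the setting and does not demonstrate a race.

```python
import sys

# Interval, in seconds, after which CPython asks the running thread to
# release the GIL so that another thread can be scheduled. A switch can
# therefore happen between any two bytecodes of your own code.
interval = sys.getswitchinterval()
print(f"thread switch interval: {interval} s")
```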

    The answer I've encountered time and time again is that multithreading in Python is rarely worth the overhead because of this. I've heard good things about the PyProcessing project, which makes running multiple processes as "simple" as multithreading, with shared data structures, queues, etc. (PyProcessing was added to the standard library in Python 2.6 as the multiprocessing module.) This gets you around the GIL, as each process has its own interpreter.

  • 2020-12-02 09:08

    Locks are still needed. I will try to explain why.

    Every operation/instruction is executed in the interpreter. Your program, with all of its threads, runs in a single interpreter, and the GIL ensures that the interpreter is held by only one thread at any instant. This means that only the thread holding the interpreter is running at that instant.

    Suppose there are two threads, say t1 and t2, and both want to execute two instructions: reading the value of a global variable and incrementing it.

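    var = 0  # shared global variable

    def increment():
        # increment value
        global var
        read_var = var      # read the shared value
        var = read_var + 1  # write back the incremented value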
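The read and the write stay separate steps even when the increment is written as one statement. As a hedged sketch, the standard `dis` module can list the bytecodes that CPython generates for `counter += 1`. Here `counter` and `increment` are hypothetical names for illustration, and the exact opcode list varies between Python versions.

```python
import dis

counter = 0  # hypothetical module-level variable, for illustration only


def increment():
    global counter
    counter += 1


# Collect the opcode names that make up the body of increment().
# The list differs between CPython versions, but loading the global and
# storing it back are always separate instructions, so a thread switch
# can happen between them.
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)
```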
    

    As stated above, the GIL only ensures that two threads can't execute an instruction simultaneously, which means the two threads can't both execute read_var = var at the same instant. But they can execute their instructions one after another, interleaved, and you can still have a problem. Consider this situation:

    • Suppose var is 0.
    • The GIL is held by thread t1.
    • t1 executes read_var = var, so read_var in t1 is 0. The GIL only ensures that no other thread executes this read at the same instant.
    • The GIL is given to thread t2.
    • t2 executes read_var = var. var is still 0, so read_var in t2 is also 0.
    • The GIL is given back to t1.
    • t1 executes var = read_var + 1, and var becomes 1.
    • The GIL is given to t2.
    • t2 still thinks read_var is 0, because that is what it read.
    • t2 executes var = read_var + 1, and var becomes 1 again.
    • We expected var to become 2.
    • So a lock must be used to make the read and the increment a single atomic operation.
    • Will Harris' answer explains it through a code example.
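The fix described in the steps above can be sketched as follows. This is a minimal illustration, not anyone's canonical implementation. `Counter` and `run_threads` are hypothetical names, and the shared value is kept on an object rather than in a global so the snippet is self-contained.

```python
import threading


class Counter:
    """Hypothetical shared counter guarded by a lock."""

    def __init__(self):
        self.var = 0
        self._lock = threading.Lock()

    def increment(self):
        # Holding the lock across both steps means no other thread can
        # read var between this thread's read and its write.
        with self._lock:
            read_var = self.var
            self.var = read_var + 1


def run_threads(counter, n_threads=4, per_thread=10_000):
    """Increment the counter from several threads and return the total."""

    def worker():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.var


final = run_threads(Counter())
print(final)  # 40000: every increment is preserved
```

If the `with self._lock:` line is removed, the interleaving described above becomes possible and the final count may come out lower than expected, although a given run is not guaranteed to show it.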