Python multi-thread multi-interpreter C API

野的像风 2021-01-31 04:18

I'm playing around with the C API for Python, but some corner cases are quite difficult to understand. I could test them, but that seems bug-prone and time-consuming. So I come

1 answer
  •  时光取名叫无心
    2021-01-31 04:38

    Sub interpreters in Python are not well documented or even well supported. The following is to the best of my understanding. It seems to work well in practice.

    There are two important concepts to understand when dealing with threads and sub interpreters in Python. First, the Python interpreter is not really multi-threaded. It has a Global Interpreter Lock (GIL) that needs to be acquired to perform almost any Python operation (there are a few rare exceptions to this rule).

    Second, every combination of thread and sub interpreter has to have its own thread state. The interpreter creates a thread state for every thread managed by it, but if you want to use Python from a thread not created by that interpreter, you need to create a new thread state.

    First you need to create the sub interpreters:

    Initialize Python

    Py_Initialize();
    

    Initialize Python thread support

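    (Required if you plan to call Python from multiple threads.) This call also acquires the GIL. Since Python 3.7, Py_Initialize() performs this step itself, and PyEval_InitThreads() has been deprecated since 3.9.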

    PyEval_InitThreads();
    

    Save the current thread state

    I could have used PyEval_SaveThread(), but one of its side effects is releasing the GIL, which then needs to be reacquired.

    PyThreadState* _main = PyThreadState_Get();
    

    Create the sub interpreters

    PyThreadState* ts1 = Py_NewInterpreter();
    PyThreadState* ts2 = Py_NewInterpreter();
    

    Restore the main interpreter thread state

    PyThreadState_Swap(_main);
    

    We now have two thread states for the sub interpreters. These thread states are only valid in the thread where they were created. Every thread that wants to use one of the sub interpreters needs to create a thread state for that combination of thread and interpreter.

    Using a sub interpreter from a new thread

    Here is example code for using a sub interpreter in a new thread that is not created by the sub interpreter. The new thread must acquire the GIL, create a new thread state for the thread and interpreter combination, and make it the current thread state. At the end, the reverse must be done to clean up.

    void do_stuff_in_thread(PyInterpreterState* interp)
    {
        // acquire the GIL
        PyEval_AcquireLock(); 
    
        // create a new thread state for the sub interpreter interp
        PyThreadState* ts = PyThreadState_New(interp);
    
        // make ts the current thread state
        PyThreadState_Swap(ts);
    
        // at this point:
        // 1. You have the GIL
        // 2. You have the right thread state - a new thread state (this thread was not created by python) in the context of interp
    
        // PYTHON WORK HERE
    
        // release ts
        PyThreadState_Swap(NULL);
    
        // clear and delete ts
        PyThreadState_Clear(ts);
        PyThreadState_Delete(ts);
    
        // release the GIL
        PyEval_ReleaseLock(); 
    }
    

    Using a sub interpreter from a new thread (post Python 3.3)

    The previous do_stuff_in_thread() still works with all current Python versions. However, Python 3.3 deprecated PyEval_AcquireLock()/PyEval_ReleaseLock(), which resulted in a bit of a conundrum.

    The only documented way to release the GIL is by calling PyEval_ReleaseThread() or PyEval_SaveThread(), both of which require a thread state, while cleaning and deleting the current thread state requires the GIL to be held. That means that one can either release the GIL or clean up the thread state, but not both.

    Fortunately, there is a solution: PyThreadState_DeleteCurrent() deletes the current thread state and then releases the GIL. (This API has only been documented since 3.9, but it has existed since at least Python 2.7.)

    This modified do_stuff_in_thread() also works with all current Python versions.

    void do_stuff_in_thread(PyInterpreterState* interp)
    {
        // create a new thread state for the sub interpreter interp
        PyThreadState* ts = PyThreadState_New(interp);
    
        // make it the current thread state and acquire the GIL
        PyEval_RestoreThread(ts);
    
        // at this point:
        // 1. You have the GIL
        // 2. You have the right thread state - a new thread state (this thread was not created by python) in the context of interp
    
        // PYTHON WORK HERE
    
        // clear ts
        PyThreadState_Clear(ts);
    
        // delete the current thread state and release the GIL
        PyThreadState_DeleteCurrent();
    }
    

    Now each thread can do the following:

    Thread1

    do_stuff_in_thread(ts1->interp);
    

    Thread2

    do_stuff_in_thread(ts1->interp);
    

    Thread3

    do_stuff_in_thread(ts2->interp);
    

    Calling Py_Finalize() destroys all sub interpreters. Alternatively they can be destroyed manually. This needs to be done in the main thread, using the thread states created when creating the sub interpreters. At the end make the main interpreter thread state the current state.

    // make ts1 the current thread state
    PyThreadState_Swap(ts1);
    // destroy the interpreter
    Py_EndInterpreter(ts1);
    
    // make ts2 the current thread state
    PyThreadState_Swap(ts2);
    // destroy the interpreter
    Py_EndInterpreter(ts2);
    
    // restore the main interpreter thread state
    PyThreadState_Swap(_main);
    

    I hope this makes things a bit clearer.

    I have a small complete example written in C++ on GitHub, and another also on GitHub (post Python 3.3 variant).
