PyEval_InitThreads in Python 3: How/when to call it? (the saga continues ad nauseam)

别那么骄傲 2020-11-27 03:06

Basically there seems to be massive confusion/ambiguity over when exactly PyEval_InitThreads() is supposed to be called, and what accompanying API

7 answers
  • 2020-11-27 03:37

    Your understanding is correct: invoking PyEval_InitThreads does, among other things, acquire the GIL. In a correctly written Python/C application, this is not an issue because the GIL will be unlocked in time, either automatically or manually.

    If the main thread goes on to run Python code, there is nothing special to do, because the Python interpreter will automatically relinquish the GIL after a number of instructions have been executed (allowing another thread to acquire it, which will relinquish it again, and so on). Additionally, whenever Python is about to invoke a blocking system call, e.g. to read from the network or write to a file, it will release the GIL around the call.

    The original version of this answer pretty much ended here. But there is one more thing to take into account: the embedding scenario.

    When embedding Python, the main thread often initializes Python and goes on to execute other, non-Python-related tasks. In that scenario nothing will automatically release the GIL, so the thread must do it itself. This is in no way specific to the code that calls PyEval_InitThreads; it is expected of all Python/C code invoked with the GIL acquired.

    For example, the main() might contain code like this:

    Py_Initialize();
    PyEval_InitThreads();
    
    Py_BEGIN_ALLOW_THREADS
    ... call the non-Python part of the application here ...
    Py_END_ALLOW_THREADS
    
    Py_Finalize();
    

    If your code creates threads manually, they need to acquire the GIL before doing anything Python-related, even as simple as Py_INCREF. To do so, use the following:

    // Acquire the GIL
    PyGILState_STATE gstate;
    gstate = PyGILState_Ensure();
    
    ... call Python code here ...
    
    // Release the GIL. No Python API allowed beyond this point.
    PyGILState_Release(gstate);
    
  • 2020-11-27 03:42

    I have seen symptoms similar to yours: deadlocks if I only call PyEval_InitThreads(), because my main thread never calls anything from Python again, and segfaults if I unconditionally call something like PyEval_SaveThread(). The symptoms depend on the version of Python and on the situation: I am developing a plug-in that embeds Python for a library that can itself be loaded as part of a Python extension. The code therefore needs to work regardless of whether Python is the main program that loaded it.

    The following worked for me with both Python 2.7 and Python 3.4, and with my library running within Python and outside of Python. In my plug-in init routine, which is executed in the main thread, I run:

      Py_InitializeEx(0);
      if (!PyEval_ThreadsInitialized()) {
        PyEval_InitThreads();
        PyThreadState* mainPyThread = PyEval_SaveThread();
      }
    

    (mainPyThread is actually some static variable, but I don't think that matters as I never need to use it again).

    Then I create threads using pthreads, and in each function that needs to access the Python API, I use:

      PyGILState_STATE gstate;
      gstate = PyGILState_Ensure();
      // Python C API calls
      PyGILState_Release(gstate);
    
  • 2020-11-27 03:51

    There are two ways to do multithreading with the Python/C API.

    1. Running different threads with the same interpreter: we start one Python interpreter and share it among the threads.

    The code is as follows:

    int main(){
    //initialize Python
    Py_Initialize();
    PyRun_SimpleString("from time import time,ctime\n"
        "print('In Main, Today is', ctime(time()))\n");
    
    //to Initialize and acquire the global interpreter lock
    PyEval_InitThreads();
    
    //release the lock  
    PyThreadState *_save;
    _save = PyEval_SaveThread();
    
    // Create threads.
    for (int i = 0; i<MAX_THREADS; i++)
    {   
        hThreadArray[i] = CreateThread
        //(...
            MyThreadFunction,       // thread function name
        //...)
    
    } // End of main thread creation loop.
    
    // Wait until all threads have terminated.
    //...
    //Close all thread handles and free memory allocations.
    //...
    
    //end python here
    //but need to check for GIL here too
    PyEval_RestoreThread(_save);
    Py_Finalize();
    return 0;
    }
    
    //the thread function
    
    DWORD WINAPI MyThreadFunction(LPVOID lpParam)
    {
    //non Pythonic activity
    //...
    
    //check for the state of Python GIL
    PyGILState_STATE gilState;
    gilState = PyGILState_Ensure();
    //execute Python here
    PyRun_SimpleString("from time import time,ctime\n"
        "print('In Thread Today is', ctime(time()))\n");
    //release the GIL           
    PyGILState_Release(gilState);   
    
    //other non Pythonic activity
    //...
    return 0;
    }
    
    2. Another method is to run the main interpreter in the main thread and give each thread its own sub-interpreter. Every thread then runs with its own separate, independent versions of all imported modules, including the fundamental modules builtins, __main__ and sys.

    The code is as follows:

    int main()
    {
    
    // Initialize the main interpreter
    Py_Initialize();
    // Initialize and acquire the global interpreter lock
    PyEval_InitThreads();
    // Release the lock     
    PyThreadState *_save;
    _save = PyEval_SaveThread();
    
    
    // create threads
    for (int i = 0; i<MAX_THREADS; i++)
    {
    
        // Create the thread to begin execution on its own.
    
        hThreadArray[i] = CreateThread
        //(...
    
            MyThreadFunction,       // thread function name
        //...);   // returns the thread identifier 
    
    } // End of main thread creation loop.
    
      // Wait until all threads have terminated.
    WaitForMultipleObjects(MAX_THREADS, hThreadArray, TRUE, INFINITE);
    
    // Close all thread handles and free memory allocations.
    // ...
    
    
    //end python here
    //but need to check for GIL here too
    //re capture the lock
    PyEval_RestoreThread(_save);
    //end python interpreter
    Py_Finalize();
    return 0;
    }
    
    //the thread functions
    DWORD WINAPI MyThreadFunction(LPVOID lpParam)
    {
    // Non Pythonic activity
    // ...
    
    //create a new interpreter
    PyEval_AcquireLock(); // acquire lock on the GIL
    PyThreadState* pThreadState = Py_NewInterpreter();
    assert(pThreadState != NULL); // check for failure
    PyEval_ReleaseThread(pThreadState); // release the GIL
    
    
    // switch in current interpreter
    PyEval_AcquireThread(pThreadState);
    
    //execute python code
    PyRun_SimpleString("from time import time,ctime\n" "print()\n"
        "print('Today is', ctime(time()))\n");
    
    // release current interpreter
    PyEval_ReleaseThread(pThreadState);
    
    //now to end the interpreter
    PyEval_AcquireThread(pThreadState); // lock the GIL
    Py_EndInterpreter(pThreadState);
    PyEval_ReleaseLock(); // release the GIL
    
    // Other non Pythonic activity
    return 0;
    }
    

    Note that the Global Interpreter Lock still applies: even though each thread has its own sub-interpreter, only one thread can execute Python code at a time. Sub-interpreters created with Py_NewInterpreter share a single, process-wide GIL, so giving each thread its own sub-interpreter does not make the threads run Python code simultaneously.

    Sources: Executing a Python interpreter in the main thread and, to each thread we can give its own sub interpreter

    Multi threading tutorial (msdn)

  • 2020-11-27 03:54

    The suggestion to call PyEval_SaveThread works:

    PyEval_InitThreads();
    PyThreadState* st = PyEval_SaveThread();
    

    However, to prevent a crash when a module is imported, make sure the Python API calls that perform the import are protected with PyGILState_Ensure and PyGILState_Release, e.g.:

    PyGILState_STATE gstate = PyGILState_Ensure();
    PyObject *pyModule_p = PyImport_Import(pyModuleName_p);
    PyGILState_Release(gstate);
    
  • 2020-11-27 03:56

    To quote above:

    The short answer: you shouldn't care about releasing the GIL after calling PyEval_InitThreads...

    Now, for a longer answer:

    I'm limiting my answer to Python extensions (as opposed to embedding Python). If we are only extending Python, then every entry point into your module is from Python. This by definition means that we don't have to worry about calling a function from a non-Python context, which makes things a bit simpler.

    If threads have NOT been initialized, then we know there is no GIL (no threads == no need for locking), and thus "It is not safe to call this function when it is unknown which thread (if any) currently has the global interpreter lock" does not apply.

    if (!PyEval_ThreadsInitialized())
    {
        PyEval_InitThreads();
    }
    

    After calling PyEval_InitThreads(), a GIL is created and assigned... to our thread, which is the thread currently running Python code. So all is good.

    Now, as for the worker "C" threads we launch ourselves: they need to ask for the GIL before running any Python-related code, so the common pattern is as follows:

    // Do only non-Python things up to this point
    PyGILState_STATE state = PyGILState_Ensure();
    // Do Python-things here, like PyRun_SimpleString(...)
    PyGILState_Release(state);
    // ... and now back to doing only non-Python things
    

    We don't have to worry about deadlock any more than in normal use of extensions. When we entered our function, we had control over Python, so either we were not using threads (thus, no GIL), or the GIL was already assigned to us. When we give control back to the Python run-time by exiting our function, the normal processing loop will check the GIL and hand off control as appropriate to other requesting threads, including our worker threads waiting in PyGILState_Ensure().

    All of this the reader probably already knows. However, the "proof is in the pudding". I've posted a very-minimally-documented example that I wrote today to learn for myself what the behavior actually was, and that things work properly. Sample Source Code on GitHub

    I was learning several things with the example, including CMake integration with Python development, SWIG integration with both of the above, and Python behaviors with extensions and threads. Still, the core of the example allows you to:

    • Load the module -- 'import annoy'
    • Load zero or more worker threads which do Python things -- 'annoy.annoy(n)'
    • Clear any worker threads -- 'annoy.annoy(0)'
    • Provide thread cleanup (on Linux) at application exit

    ... and all of this without any crashes or segfaults. At least on my system (Ubuntu Linux w/ GCC).

  • 2020-11-27 03:56

    You don't need to call that in your extension modules. That's for initializing the interpreter which has already been done if your C-API extension module is being imported. This interface is to be used by embedding applications.

    When is PyEval_InitThreads meant to be called?
