cython openmp single, barrier

一向 2021-01-22 01:51

I'm trying to use openmp in cython. I need to do two things in cython:

i) use the #pragma omp single{} scope in my cython code.

ii) use the

1 answer
    闹比i (original poster)
    2021-01-22 02:08

    Cython has some support for OpenMP, but if OpenMP pragmas are used extensively, it is probably easier to write the code in C and wrap the resulting code with Cython.
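
    As a minimal sketch of that route, the plain C function below uses single, barrier and critical directly. The function name omp_single_barrier_demo and its out-parameters are made up for illustration and do not come from the question. It assumes gcc with -fopenmp; without that flag the pragmas are ignored and the code simply runs on one thread.

```c
/* Hypothetical helper (not from the answer above): one thread sets a value
 * in a single block, all threads then read it after an explicit barrier. */
void omp_single_barrier_demo(int *single_hits, int *critical_sum)
{
    int a = 0;  /* shared: written by exactly one thread */
    int b = 0;  /* shared: updated by every thread       */

    #pragma omp parallel shared(a, b)
    {
        /* nowait drops the implicit barrier at the end of single ... */
        #pragma omp single nowait
        {
            a += 1;
        }

        /* ... so an explicit barrier is needed before a is read. */
        #pragma omp barrier

        #pragma omp critical
        {
            b += a;  /* each thread adds 1, one thread at a time */
        }
    }

    *single_hits = a;
    *critical_sum = b;
}
```

    With n threads, single_hits ends up as 1 and critical_sum as n. From Cython, a function like this could then be declared in a cdef extern block and called like any other C function.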


    As an alternative, you could use verbatim C code and tricks with defines to bring some of this functionality to Cython. However, using pragmas inside defines isn't straightforward (_Pragma is the C99 solution, while MSVC does its own thing, as always, with __pragma). Here are some examples as a proof of concept for Linux/gcc:

    cdef extern from *:
        """
        #define START_OMP_PARALLEL_PRAGMA() _Pragma("omp parallel") {
        #define END_OMP_PRAGMA() }
        #define START_OMP_SINGLE_PRAGMA() _Pragma("omp single") {
        #define START_OMP_CRITICAL_PRAGMA() _Pragma("omp critical") {   
        """
        void START_OMP_PARALLEL_PRAGMA() nogil
        void END_OMP_PRAGMA() nogil
        void START_OMP_SINGLE_PRAGMA() nogil
        void START_OMP_CRITICAL_PRAGMA() nogil
    

    We make Cython believe that START_OMP_PARALLEL_PRAGMA() and co. are nogil functions, so it puts them into the C code as they are, where they get picked up by the preprocessor.
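
    The same macros could be made less gcc-specific by choosing between _Pragma and __pragma at compile time. The following is only a sketch in plain C: OMP_PRAGMA, OMP_PRAGMA_STR and run_single_once are hypothetical names, and the MSVC branch is an assumption based on the __pragma keyword mentioned above, untested here.

```c
/* Sketch: one macro that emits an OpenMP pragma via __pragma on MSVC
 * and via the C99 _Pragma operator elsewhere. */
#if defined(_MSC_VER)
#  define OMP_PRAGMA(directive) __pragma(directive)
#else
#  define OMP_PRAGMA_STR(directive) _Pragma(#directive)
#  define OMP_PRAGMA(directive) OMP_PRAGMA_STR(directive)
#endif

/* Same block-opening/closing macros as above, built on OMP_PRAGMA. */
#define START_OMP_PARALLEL_PRAGMA() OMP_PRAGMA(omp parallel) {
#define START_OMP_SINGLE_PRAGMA()   OMP_PRAGMA(omp single) {
#define END_OMP_PRAGMA()            }

/* Hypothetical test function: the single block runs exactly once,
 * regardless of the number of threads. */
int run_single_once(void)
{
    int hits = 0;  /* shared between the threads of the parallel region */
    START_OMP_PARALLEL_PRAGMA()
        START_OMP_SINGLE_PRAGMA()
            hits += 1;
        END_OMP_PRAGMA()  /* single   */
    END_OMP_PRAGMA()      /* parallel */
    return hits;
}
```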

    We must use the syntax

    #pragma omp single
    {
       //do_something
    }
    

    and not

    #pragma omp single
    do_something
    

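    because of the way Cython generates C code: a single Cython statement can expand to several C statements, and without braces the pragma would apply only to the first of them.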

    The usage could look as follows (I'm avoiding cython.parallel.parallel here, as it does too much magic for this simple example):

    %%cython -c=-fopenmp --link-args=-fopenmp
    cdef extern from *:  # macro definitions as listed above
        ...

    def test_omp():
        cdef int a = 0
        cdef int b = 0
        with nogil:
            START_OMP_PARALLEL_PRAGMA()
            START_OMP_SINGLE_PRAGMA()
            a += 1             # executed by exactly one thread
            END_OMP_PRAGMA()   # SINGLE (implicit barrier at its end)
            START_OMP_CRITICAL_PRAGMA()
            b += 1             # executed by every thread, one at a time
            END_OMP_PRAGMA()   # CRITICAL
            END_OMP_PRAGMA()   # PARALLEL
        print(a, b)
    

    Calling test_omp prints "1 2" on my machine with 2 threads, as expected: the single block runs once and the critical block runs once per thread (the number of threads can be changed with openmp.omp_set_num_threads(10)).

    However, the above is still very brittle: some of the error checking generated by Cython can lead to invalid code (Cython uses goto for control flow, and it is not possible to jump out of an OpenMP block). Something like this happens in your example:

    cimport numpy as np
    import numpy as np
    def test_omp2():
        cdef np.int_t[:] a=np.zeros(1,dtype=int)
    
        START_OMP_SINGLE_PRAGMA()
        a[0]+=1
        END_OMP_PRAGMA()
    
        print(a)
    

    Because of bounds checking, Cython will produce:

    START_OMP_SINGLE_PRAGMA();
    ...
    //check bounds:
    if (unlikely(__pyx_t_6 != -1)) {
        __Pyx_RaiseBufferIndexError(__pyx_t_6);
        __PYX_ERR(0, 30, __pyx_L1_error)  // expands to a goto that jumps out of the block
    }
    ...
    END_OMP_PRAGMA();
    

    In this special case, setting boundscheck to False, i.e.

    cimport cython
    @cython.boundscheck(False) 
    def test_omp2():
       ...
    

    would solve the issue for the above example, but probably not in general.
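
    When the check cannot simply be switched off, one common pattern in hand-written C is to record the error inside the OpenMP block and act on it only after the block has ended, so that no jump ever leaves the block. A minimal sketch, in which the hypothetical function increment_checked stands in for the bounds-checked access to a[0]:

```c
#include <stddef.h>

/* Hypothetical example: increments data[index] inside a single block.
 * Returns 0 on success and -1 if index is out of bounds. */
int increment_checked(int *data, size_t length, size_t index)
{
    int error = 0;  /* shared flag, written inside the OpenMP block */

    #pragma omp parallel shared(error)
    {
        #pragma omp single
        {
            if (index < length) {
                data[index] += 1;
            } else {
                error = 1;  /* remember the failure; no goto or return here */
            }
        }
    }

    /* Outside the parallel region it is safe to leave the function early. */
    return error ? -1 : 0;
}
```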

    Once again: using openmp in C (and wrapping the functionality with Cython) is a more enjoyable experience.


    As a side note: Python threads (the ones governed by the GIL) and OpenMP threads are different and know nothing about each other. The above example would also work (compile and run) correctly without releasing the GIL: OpenMP threads do not care about the GIL, and as there are no Python objects involved, nothing can go wrong. I have added nogil to the wrapped "functions" so that they can also be used in nogil blocks.

    However, as the code gets more complicated, it becomes less obvious that variables shared between different Python threads are not accessed, above all because such accesses can happen in the generated C code without being apparent from the Cython code. In that case it might be wiser not to release the GIL while using OpenMP.
