Parallel in-place sort for numpy arrays

后端 未结 2 1163
感动是毒
感动是毒 2021-02-14 08:19

I often need to sort large numpy arrays (few billion elements), which became a bottleneck of my code. I am looking for a way to parallelize it.

Are there any parallel i

2条回答
  •  灰色年华
    2021-02-14 09:04

    I ended up wrapping GCC parallel sort. Here is the code:

    parallelSort.pyx

    # cython: wraparound = False
    # cython: boundscheck = False
    import numpy as np
    cimport numpy as np
    import cython
    cimport cython 
    
    ctypedef fused real:
        cython.char
        cython.uchar
        cython.short
        cython.ushort
        cython.int
        cython.uint
        cython.long
        cython.ulong
        cython.longlong
        cython.ulonglong
        cython.float
        cython.double
    
    cdef extern from "" namespace "__gnu_parallel":
        cdef void sort[T](T first, T last) nogil 
    
    def numpyParallelSort(real[:] a):
        "In-place parallel sort for numpy types"
        sort(&a[0], &a[a.shape[0]])
    

    Extra compiler args: -fopenmp (compile) and -lgomp (linking)

    This makefile will do it:

    all:
        cython --cplus parallelSort.pyx  
        g++  -g -march=native -Ofast -fpic -c    parallelSort.cpp -o parallelSort.o -fopenmp `python-config --includes`
        g++  -g -march=native -Ofast -shared  -o parallelSort.so parallelSort.o `python-config --libs` -lgomp 
    
    clean:
        rm -f parallelSort.cpp *.o *.so
    

    And this shows that it works:

    from parallelSort import numpyParallelSort
    import numpy as np 
    a = np.random.random(100000000)
    
    numpyParallelSort(a) 
    print a[:10]
    

    edit: fixed bug noticed in the comment below

提交回复
热议问题