Question
I have the following simple Cython function for a parallel reduction:
# cython: boundscheck = False
# cython: initializedcheck = False
# cython: wraparound = False
# cython: cdivision = True
# cython: language_level = 3
from cython.parallel import parallel, prange
cpdef double simple_reduction(int n, int num_threads):
    cdef int i
    cdef int sum = 0
    for i in prange(n, nogil=True, num_threads=num_threads):
        sum += 1
    return sum
Which horrifyingly returns the following:
In [3]: simple_reduction(n=10, num_threads=1)
Out[3]: 10.0
In [4]: simple_reduction(n=10, num_threads=2)
Out[4]: 20.0
In [5]: simple_reduction(n=10, num_threads=3)
Out[5]: 30.0
In other words, it appears to be repeating all n iterations of the loop on every thread instead of dividing the iterations among the threads. Any idea what's going on?
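To make the symptom concrete, here is a minimal pure-Python model of the two behaviours. It is not Cython or OpenMP code, the function names are hypothetical, and the cyclic split of iterations is only an assumption chosen for illustration. It contrasts the expected work sharing with what the outputs above suggest is happening.

```python
def work_shared_sum(n, num_threads):
    """Expected behaviour: each iteration is executed by exactly one thread."""
    total = 0
    for t in range(num_threads):
        # Assumed cyclic split: thread t takes iterations t, t + num_threads, ...
        total += sum(1 for _ in range(t, n, num_threads))
    return total


def replicated_sum(n, num_threads):
    """Observed behaviour: every thread executes all n iterations."""
    total = 0
    for _ in range(num_threads):
        total += sum(1 for _ in range(n))
    return total


for threads in (1, 2, 3):
    # Prints the thread count, the work-shared total, and the replicated total.
    print(threads, work_shared_sum(10, threads), replicated_sum(10, threads))
```

The replicated totals (10, 20, 30) match the outputs reported above, while a correctly work-shared loop would give 10 regardless of the thread count.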
I am using Python 3.7.1 and Cython 0.29.2 on macOS Mojave 10.14.3.
UPDATE: Here's my setup.py file:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
from Cython.Build import cythonize
import os
import sys
if sys.platform == 'darwin':
    os.environ['CC'] = 'gcc-8'
    os.environ['CXX'] = 'g++-8'
EXT_MODULES = [Extension('foo', ['foo.pyx'],
                         extra_compile_args=['-fopenmp'],
                         extra_link_args=['-fopenmp'])]
setup(name='foo',
      ext_modules=cythonize(EXT_MODULES))
I have installed GCC separately and have to set the environment variables 'CC' and 'CXX' on OSX, because OSX aliases gcc and g++ to clang.
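Since the build depends on which compiler CC points at, it can help to check whether that compiler accepts -fopenmp before building the extension. The following is a minimal sketch, not part of the original setup.py; the helper name compiler_supports_openmp and the test program are hypothetical. It only confirms that a small OpenMP program compiles and links, and does not by itself diagnose the repeated-iteration behaviour.

```python
import os
import subprocess
import tempfile

# Tiny C program that only builds if OpenMP headers and runtime are available.
OPENMP_TEST_SOURCE = """
#include <omp.h>
int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }
"""


def compiler_supports_openmp(cc):
    """Return True if `cc -fopenmp` can build a small OpenMP program."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "omp_check.c")
        exe = os.path.join(tmp, "omp_check")
        with open(src, "w") as f:
            f.write(OPENMP_TEST_SOURCE)
        try:
            result = subprocess.run(
                [cc, "-fopenmp", src, "-o", exe],
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
            )
        except OSError:
            # The compiler executable was not found or could not be run.
            return False
        return result.returncode == 0


# Example use in setup.py (assumes CC has been set as shown above):
# print(compiler_supports_openmp(os.environ.get("CC", "cc")))
```

A False result for the compiler named in CC would suggest fixing the toolchain before looking at the Cython code.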
Answer 1:
I fixed this bug by first installing gcc using Anaconda:
conda install gcc
Then changing the lines in setup.py to use that new compiler:
if sys.platform == 'darwin':
    os.environ['CC'] = '/anaconda3/bin/gcc'
    os.environ['CXX'] = '/anaconda3/bin/g++'
Using Anaconda gcc (instead of the brew-installed one I was using originally) didn't fix the problem right away. The build failed with the following error:
/anaconda3/envs/python36/lib/gcc/x86_64-apple-darwin11.4.2/4.8.5/include-fixed/limits.h:168:61: fatal error: limits.h: No such file or directory
 #include_next <limits.h>  /* recurse down to the real one */
The problem here has to do with macOS 10.14 and Xcode 10.0. However, the solution given by @Maxxx in this related question worked for me. After installing the .pkg hidden in the command line tools directory
/Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg
the code compiled and the parallelism worked as it was supposed to.
UPDATE: After updating to macOS Catalina, this fix no longer works because the .pkg file above no longer exists. I found a new solution from reading this related question. In my case, exporting the following path to CPATH fixed the problem.
export CPATH=~/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include
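As an alternative to hard-coding the SDK location, the same CPATH value could be derived inside setup.py. This is a hedged sketch rather than the method used above: it assumes the xcrun tool is available and that its --show-sdk-path option reports the active SDK, and the helper names are hypothetical.

```python
import os
import subprocess


def sdk_include_dir(sdk_path):
    """Return the usr/include directory inside a macOS SDK directory."""
    return os.path.join(sdk_path, "usr", "include")


def find_macos_sdk_path():
    """Ask xcrun for the active SDK path; return None if it is unavailable."""
    try:
        result = subprocess.run(
            ["xcrun", "--show-sdk-path"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        # xcrun is not installed (for example, on a non-macOS system).
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def set_cpath_from_sdk(environ=os.environ):
    """Prepend the SDK include directory to CPATH; return it, or None."""
    sdk = find_macos_sdk_path()
    if sdk is None:
        return None
    include_dir = sdk_include_dir(sdk)
    existing = environ.get("CPATH")
    if existing:
        environ["CPATH"] = include_dir + os.pathsep + existing
    else:
        environ["CPATH"] = include_dir
    return include_dir
```

Calling set_cpath_from_sdk() inside the sys.platform == 'darwin' branch of setup.py, before setup(...), would have the same effect as the export line above when the SDK lives at that path.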
Source: https://stackoverflow.com/questions/54776301/cython-prange-is-repeating-not-parallelizing