scikit-learn's GridSearchCV stops working when n_jobs>1

渐次进展 2020-12-31 23:54

I previously asked a question here and came up with the following lines of code:

parameters = [{'weights': ['uniform'], 'n_neighbors': [5, 10, 20, 30, 40, 50, 60, 70


        
2 Answers
  • 2021-01-01 00:17

    That worked perfectly for me as well (upgrading was a bit of a drag, but of the many fixes I tried, this was the only one that worked in my case). For other IPython notebook users: the best way to apply it is to add it to the notebook configuration, since running it directly in a notebook raises an error. The commands can be added like this:

    # in ipython_notebook_config.py
    c.IPKernelApp.exec_lines = ['import multiprocessing', 'multiprocessing.set_start_method("forkserver")']
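
One plausible reason for the error in a running notebook is that `multiprocessing.set_start_method` may only be called once per process and raises `RuntimeError` if a start method is already fixed. The sketch below sets the method only when none has been chosen yet. The helper name `ensure_start_method` is hypothetical, and this is an illustration rather than a confirmed diagnosis of the notebook error.

```python
import multiprocessing as mp


def ensure_start_method(method="forkserver"):
    """Set the start method only if none is fixed yet; return the active one.

    Hypothetical helper, not part of multiprocessing or scikit-learn.
    """
    current = mp.get_start_method(allow_none=True)
    if current is not None:
        # A start method is already fixed for this process; calling
        # set_start_method again would raise RuntimeError.
        return current
    if method not in mp.get_all_start_methods():
        # The requested method is not supported on this platform.
        return None
    mp.set_start_method(method)
    return method
```

Calling it a second time simply returns the method that is already active instead of raising.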
    
  • 2021-01-01 00:30

    libdispatch.dylib from Grand Central Dispatch is used internally by Accelerate, OSX's builtin implementation of BLAS, whenever you call numpy.dot. The GCD runtime does not work when a program calls the POSIX fork syscall without a subsequent exec syscall, which makes any Python program that uses the multiprocessing module prone to crashing. sklearn's GridSearchCV uses the Python multiprocessing module for parallelization.

    Under Python 3.4 and later you can force Python multiprocessing to use the forkserver start method instead of the default fork mode to work around this problem, for instance at the beginning of the main file of your program:

    if __name__ == "__main__":
        import multiprocessing as mp
        mp.set_start_method('forkserver')
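
The forkserver start method is not available on every platform (Windows, for example, only supports spawn). As a minimal sketch, assuming you want the same script to run elsewhere too, you could pick the first supported method from a preference list. The helper name `pick_start_method` and the preference order are assumptions for illustration.

```python
import multiprocessing as mp


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first start method in `preferred` that this platform supports.

    Hypothetical helper; both forkserver and spawn avoid a bare fork.
    """
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    # Nothing preferred is supported: fall back to the platform default,
    # which is the first entry of get_all_start_methods().
    return available[0]


if __name__ == "__main__":
    # Set the start method before any worker processes are created.
    mp.set_start_method(pick_start_method())
    print(mp.get_start_method())
```

The call stays under the `__main__` guard because forkserver and spawn workers re-import the main module.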
    

    Alternatively, you can rebuild numpy from source and make it link against ATLAS or OpenBLAS instead of OSX Accelerate. The numpy developers are working on binary distributions that include either ATLAS or OpenBLAS by default.
