Python multiprocessing in Jupyter on Windows: AttributeError: Can't get attribute 'abc'

夕颜 2020-12-06 17:09

I am trying to run a simple command that guesses gender by name using multiprocessing. This code worked on a previous machine, so perhaps my setup has something to do with it.

4 answers
  • 2020-12-06 17:18

    I got multiprocessing to work from within a Jupyter notebook on Windows by saving my function in a separate .py file and importing that file in my notebook.

    Example:

    f.py:

    def f(name, output):
      output.put('hello {0}'.format(name))
      return
    

    Code in Jupyter notebook:

    from multiprocessing import Process, Queue
    
    #Having the function definition here results in
    #AttributeError: Can't get attribute 'f' on <module '__main__' (built-in)>
    
    #The solution seems to be importing the function from a separate file.
    
    import f
    
    #Also, the original version of f only had a print statement in it.  
    #That doesn't work with Process - in the sense that it prints to the console 
    #instead of the notebook.
    #The trick is to let f write the string to print into an output-queue.
    #When Process is done, the result is retrieved from the queue and printed.
    
    if __name__ == '__main__':    
    
       # Define an output queue
       output=Queue()
    
       # Create the process that will run f.f
       p = Process(target=f.f, args=('Bob', output))
    
       # Run process
       p.start()
    
       # Wait for the process to finish
       p.join()
    
       # Get the process result from the output queue
       result = output.get()
    
       print(result)
    

    I'm a Python newbie and may have missed all sorts of details, but this works for me.
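A possible explanation for why the import helps, sketched with the standard pickle module. On Windows, multiprocessing starts each worker as a fresh interpreter and sends it the target function in pickled form, and pickle records a function as a module name plus an attribute name rather than as code. A function defined in a notebook cell is recorded as belonging to `__main__`, which the worker cannot re-import, whereas a function in f.py can be looked up again. In the sketch below, `json.dumps` is only a stand-in (an assumption for illustration) for any function that lives in an importable module.

```python
import json
import pickle

# json.dumps stands in for a function defined at the top level of an
# importable module, such as f.f in the answer above.
payload = pickle.dumps(json.dumps)

# The payload names the module and the function; it carries no function code.
print(payload)

# Unpickling repeats the import and attribute lookup, which is the step
# that fails in a worker when the function only exists in a notebook cell.
restored = pickle.loads(payload)
print(restored is json.dumps)
```

The second print shows that unpickling hands back the very same function object, found by name rather than rebuilt from the payload.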

  • 2020-12-06 17:23

    This problem is a headache for people using Jupyter on Windows. The same code runs fine on Linux.

    To run the code on Windows:

    1. Put the function definition in a separate ipynb file.
    2. Import it with from ipynb.fs.full.functions import func, where functions.ipynb is the file from step 1 (make sure you pip install ipynb first).

    This should solve the problem.
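The Linux/Windows difference mentioned in this answer most likely comes down to the process start method: Windows supports only `spawn`, while Linux has historically defaulted to `fork`, which copies the parent's memory (including functions defined in notebook cells) into the worker. A minimal sketch for checking what a given machine uses, relying only on the standard library:

```python
import multiprocessing as mp

# Start methods this platform supports; "spawn" is available everywhere.
available = mp.get_all_start_methods()

# The method used when none has been chosen explicitly.
default = mp.get_start_method()

print("available:", available)
print("default:", default)
```

If the default is `spawn`, functions passed to workers need to be importable from a module, as described in the steps above.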
  • 2020-12-06 17:32

    After much research, it appears that multiprocessing is not an option in a notebook on Windows. I am closing this, but please reopen it if you have a solution. I will switch over to pathos.

  • 2020-12-06 17:41

    How about this:

    Code:

    #!/usr/bin/env python3
    
    import time
    import gender_guesser.detector as gender
    import pandas as pd
    import multiprocessing as mp
    
    # Module-level detector so each worker process has its own copy
    d = gender.Detector()
    
    def guess_gender(name):
        # Title-case the name before looking it up
        return d.get_gender(name.title())
    
    def run():
        st = time.time()
        ls = ['john','joe','amamda','derick','peter','ashley','john',
              'joe','amamda','derick','peter','ashley']
    
        # Leave one core free, but always use at least one worker
        num_cpus = max(1, mp.cpu_count() - 1)
        with mp.Pool(processes=num_cpus) as pool:
            result = pool.map(guess_gender, ls)
    
        df = pd.DataFrame(result, columns=["gender"])
    
        print("\ntook {} secs to classify\n".format(time.time() - st))
        print(df)  # or you could save the dataframe using .to_csv()
    
    if __name__ == "__main__":
        run()
    

    Output:

    took 0.0150408744812 secs to classify
    
               gender
    0            male
    1            male
    2         unknown
    3            male
    4            male
    5   mostly_female
    6            male
    7            male
    8         unknown
    9            male
    10           male
    11  mostly_female
    