Same output in different workers in multiprocessing

耶瑟儿~ 2020-11-27 07:47

I have very simple cases where the work to be done can be broken up and distributed among workers. I tried a very simple multiprocessing example from here:

2 Answers
  • 2020-11-27 08:03

    This blog post gives an example of good and bad practice when using numpy.random with multiprocessing. The important thing is to understand when the seed of your pseudo-random number generator (PRNG) is created:

    import pprint
    from multiprocessing import Pool

    import numpy as np

    pp = pprint.PrettyPrinter()

    def bad_practice(index):
        # Uses the global generator, whose state forked workers copy from the parent.
        return np.random.randint(0, 10, size=10)

    def good_practice(index):
        # Creates a new RandomState, seeded from OS entropy, on every call.
        return np.random.RandomState().randint(0, 10, size=10)

    if __name__ == "__main__":
        with Pool(5) as p:
            pp.pprint("Bad practice: ")
            pp.pprint(p.map(bad_practice, range(5)))
            pp.pprint("Good practice: ")
            pp.pprint(p.map(good_practice, range(5)))
    

    Output (on a platform that forks worker processes, such as Linux):

    'Bad practice: '
    [array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
     array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
     array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
     array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9]),
     array([4, 2, 8, 0, 1, 1, 6, 1, 2, 9])]
    'Good practice: '
    [array([8, 9, 4, 5, 1, 0, 8, 1, 5, 4]),
     array([5, 1, 3, 3, 3, 0, 0, 1, 0, 8]),
     array([1, 9, 9, 9, 2, 9, 4, 3, 2, 1]),
     array([4, 3, 6, 2, 6, 1, 2, 9, 5, 2]),
     array([6, 3, 5, 9, 7, 1, 7, 4, 8, 5])]
    

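    In the good practice, a new generator is created and seeded on every call, inside the worker process. In the bad practice, the global generator is seeded only once, when the numpy.random module is imported in the parent process, and every forked worker starts from a copy of that same state.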
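
As a complement to the RandomState approach above, here is a minimal sketch using NumPy's newer Generator API, which is a different technique from the one in the blog post. It assumes NumPy 1.17 or later. The names `draw` and `root_seed`, and the seed value 12345, are illustrative and not from the original post. Each task seeds its own generator from the pair `[task_id, root_seed]`, so the streams differ between tasks but the run as a whole is reproducible.

```python
from multiprocessing import Pool

import numpy as np


def draw(task_id, root_seed=12345):
    # Hypothetical helper: seed a per-task Generator from the task id plus
    # a shared root seed, so each task gets its own reproducible stream.
    rng = np.random.default_rng([task_id, root_seed])
    return rng.integers(0, 10, size=10)


if __name__ == "__main__":
    with Pool(5) as pool:
        for row in pool.map(draw, range(5)):
            print(row)
```

Unlike calling `np.random.RandomState()` with no seed, this keeps results repeatable across runs, which may matter when debugging a parallel computation.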

  • 2020-11-27 08:08

    I think you'll need to re-seed the random number generator using numpy.random.seed in your do_calculation function.

    My guess is that the random number generator (RNG) is seeded when you import the module. When multiprocessing then forks the current process, each worker receives a copy of the already-seeded RNG state, so all of your processes start from the same state and generate the same sequence of numbers.

    e.g.:

    import numpy as np

    def do_calculation(data):
        # Re-seed the global generator from OS entropy inside the worker.
        np.random.seed()
        rand = np.random.randint(10)
        print(data, rand)
        return data * 2
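
The explanation above can be illustrated without starting any processes. The sketch below only simulates a fork by copying one generator's state into several RandomState objects; the parent seed 0 and the names `parent`, `workers`, `inherited`, and `reseeded` are assumptions made for the illustration. Copies that share a state produce identical draws, and re-seeding each copy from OS entropy (what `np.random.seed()` with no argument does for the global generator) makes them diverge.

```python
import numpy as np

# Simulate a fork: every "worker" starts from a copy of the parent's state.
parent = np.random.RandomState(0)  # arbitrary illustrative seed
snapshot = parent.get_state()

workers = [np.random.RandomState() for _ in range(3)]
for w in workers:
    w.set_state(snapshot)

# All copies share the same state, so they produce the same numbers.
inherited = [w.randint(0, 10, size=5) for w in workers]
print(inherited)

# Re-seed each copy from OS entropy, then draw again.
for w in workers:
    w.seed()
reseeded = [w.randint(0, 10, size=5) for w in workers]
print(reseeded)
```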
    