Python multiprocessing apply_async “assert left > 0” AssertionError

轻奢々 2021-01-06 23:42

I am trying to load numpy files asynchronously in a Pool:

self.pool = Pool(2, maxtasksperchild=1)
...
nextPackage = self.pool.apply_async(loadPackages, (..

2 Answers
  •  别那么骄傲
    2021-01-07 00:21

    I think I've found a workaround: retrieve the data in small chunks. In my case the data was a list of lists.

    I had:

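    import gc

    # Original approach: copy each proxy's whole list across in a single call.
    for i in range(NUMBER_OF_THREADS):
        print('MAIN: Getting data from process ' + str(i) + ' proxy...')
        X_train.extend(ListasX[i]._getvalue())
        Y_train.extend(ListasY[i]._getvalue())
        # Drop the proxy references so they can be garbage-collected.
        ListasX[i] = None
        ListasY[i] = None
        gc.collect()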
    

    Changed to:

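    import gc

    CHUNK_SIZE = 1024  # number of items fetched per request
    for i in range(NUMBER_OF_THREADS):
        print('MAIN: Getting data from process ' + str(i) + ' proxy...')
        # Fetch each proxied list one slice at a time instead of all at once.
        for k in range(0, len(ListasX[i]), CHUNK_SIZE):
            X_train.extend(ListasX[i][k:k+CHUNK_SIZE])
            Y_train.extend(ListasY[i][k:k+CHUNK_SIZE])
        # Drop the proxy references so they can be garbage-collected.
        ListasX[i] = None
        ListasY[i] = None
        gc.collect()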
    

    And now it seems to work, possibly by serializing less data at a time. So maybe if you can segment your data into smaller portions you can overcome the issue. Good luck!
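
For anyone who wants to reuse the idea, here is a minimal sketch of the same chunked copy pulled out into helpers. The names `iter_chunks` and `drain_in_chunks` are made up for illustration and are not part of `multiprocessing`. The sketch only assumes that each source supports `len()` and slicing, as the proxies in the loop above do; plain lists stand in for the proxies in the demo.

```python
CHUNK_SIZE = 1024  # same value as in the answer; tune it for your data


def iter_chunks(seq, chunk_size=CHUNK_SIZE):
    """Yield consecutive slices of seq, each at most chunk_size items long."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(seq), chunk_size):
        # Each slice is fetched separately, so only a small piece of the
        # source has to be transferred at a time.
        yield seq[start:start + chunk_size]


def drain_in_chunks(sources, chunk_size=CHUNK_SIZE):
    """Copy every item from each source into one local list, slice by slice."""
    collected = []
    for source in sources:
        for chunk in iter_chunks(source, chunk_size):
            collected.extend(chunk)
    return collected


if __name__ == "__main__":
    # Plain lists stand in for the manager proxies (hypothetical data).
    fake_proxies = [list(range(3000)), list(range(3000, 5000))]
    merged = drain_in_chunks(fake_proxies)
    print(len(merged))
```

Whether this avoids the assertion in your case probably depends on how large each individual slice ends up being, so treat the chunk size as something to experiment with.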
