“Can't pickle ” error when using multiprocessing on Windows

悲哀的现实 2021-01-12 11:55

I'm writing a multiprocessing program to handle a large .CSV file in parallel, using Windows.

I found this excellent example for a similar problem. When running it

2 answers
  •  清酒与你
    2021-01-12 12:25

    multiprocessing depends on serializing (pickling) and de-serializing objects when passing them as parameters between processes. Your code passes an instance of CSVWorker (the instance denoted as 'self') between processes, and that is why you got this error: neither csv readers nor open files can be pickled.
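
    The failure can be reproduced without multiprocessing at all. The sketch below assumes Python 3, and its CSVWorker is only a hypothetical stand-in for the class in the question (which is not shown here): it holds an open file and a csv reader, and pickle rejects both, as well as the instance that contains them. A plain list of strings pickles without trouble.

```python
import csv
import os
import pickle
import tempfile


class CSVWorker:
    """Hypothetical stand-in for the question's class, which is not shown here."""

    def __init__(self, path):
        self.infile = open(path, newline="")
        self.reader = csv.reader(self.infile)

    def close(self):
        self.infile.close()


def pickle_error(obj):
    """Return the error message if obj cannot be pickled, else None."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError) as exc:
        return str(exc)
    return None


def demo():
    # Create an empty temporary CSV so the worker has a real file to open.
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    worker = CSVWorker(path)
    try:
        return {
            "file": pickle_error(worker.infile),
            "reader": pickle_error(worker.reader),
            "worker": pickle_error(worker),
            "row": pickle_error(["a", "b"]),  # plain data is fine
        }
    finally:
        worker.close()
        os.remove(path)


if __name__ == "__main__":
    for name, err in demo().items():
        print(name, "->", err or "picklable")
```

    Anything handed to another process goes through the same pickling step, so keeping file handles and readers out of the objects you pass around avoids the error.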

    You mentioned that your CSV files are large, so I don't think reading all the data into a list would be a solution for you. Instead, you have to think of a way of passing one line at a time from your input CSV to each worker and retrieving a processed line from each worker, while performing all the I/O in the main process.

    It looks like multiprocessing.Pool will be a better way of writing your application. Check the multiprocessing documentation at http://docs.python.org/library/multiprocessing.html and try using a process pool, with pool.map, to process your CSVs. It also takes care of preserving the order, which will eliminate a lot of the complicated logic in your code.
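
    A minimal sketch of that layout, assuming Python 3. The names process_row and process_csv and the chunksize value are hypothetical, and the per-row work (upper-casing each field) is only a placeholder. It uses pool.imap rather than pool.map: both return results in input order, but map first turns the reader into a full list, which the large input makes undesirable. Only plain lists of strings cross the process boundary; the file handles stay in the main process.

```python
import csv
import multiprocessing
import os
import tempfile


def process_row(row):
    """Hypothetical per-row work. Defined at module level so it can be pickled."""
    return [field.upper() for field in row]


def process_csv(in_path, out_path, processes=2, chunksize=100):
    """Do all I/O in the main process and send only rows to the workers."""
    with open(in_path, newline="") as infile, \
            open(out_path, "w", newline="") as outfile, \
            multiprocessing.Pool(processes) as pool:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        # imap yields results in the same order as the input rows.
        for result in pool.imap(process_row, reader, chunksize=chunksize):
            writer.writerow(result)


if __name__ == "__main__":
    # On Windows, worker processes re-import this module, so the code that
    # creates the pool must sit behind this guard.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.csv")
        dst = os.path.join(tmp, "out.csv")
        with open(src, "w", newline="") as f:
            csv.writer(f).writerows([["a", "b"], ["c", "d"]])
        process_csv(src, dst)
        with open(dst, newline="") as f:
            print(list(csv.reader(f)))
```

    If the per-row work is cheap, a larger chunksize reduces the overhead of shipping rows between processes one at a time.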
