Multiprocess multiple files in a list

伪装坚强ぢ 2020-12-16 07:40

I am trying to read N .csv files, whose paths are stored in a list, synchronously.

Right now I do the following:

import multiprocess

1 Answer
  • 2020-12-16 07:58

    I'm guessing here at your request, because the original question is quite unclear. Since os.listdir doesn't guarantee an ordering, I'm assuming your "two" functions are actually identical and you just need to perform the same process on multiple files simultaneously.

    The easiest way to do this, in my experience, is to spin up a Pool, submit one task per file, and then wait for them all to finish, e.g.

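    import glob
    import multiprocessing
    import os

    def process(file):
        pass  # do stuff to a file

    if __name__ == "__main__":  # guard needed where workers are spawned (e.g. Windows)
        folder = "data"  # hypothetical folder; point this at your .csv files
        p = multiprocessing.Pool()  # defaults to one worker process per CPU core
        for f in glob.glob(os.path.join(folder, "*.csv")):
            # Queue one task per file; the workers pick tasks up as they
            # become free, so roughly one file per core runs at a time.
            p.apply_async(process, [f])

        p.close()  # No more tasks will be submitted.
        p.join()   # Wait for all queued tasks to finish.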
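
The snippet above does not show how to get anything back from the workers: apply_async returns an AsyncResult, and an exception raised inside a worker stays hidden unless you call .get() on that result. If you need return values, Pool.map is one option. The following is only a sketch under assumptions: count_rows and process_folder are hypothetical names, counting rows is a stand-in for whatever you actually do per file, and "data" is a placeholder folder.

```python
import csv
import glob
import multiprocessing
import os


def count_rows(path):
    """Hypothetical per-file job: return the file name and its row count."""
    with open(path, newline="") as handle:
        return os.path.basename(path), sum(1 for _ in csv.reader(handle))


def process_folder(folder, workers=None):
    """Run count_rows on every .csv in folder and return {name: row_count}."""
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    # workers=None lets Pool default to one worker per CPU core.
    with multiprocessing.Pool(processes=workers) as pool:
        # map() blocks until every file is done and re-raises the first
        # exception a worker hit, so failures are not silently dropped.
        return dict(pool.map(count_rows, paths))


if __name__ == "__main__":
    print(process_folder("data"))  # "data" is a placeholder folder
```

Because map() waits for all results, there is no separate close()/join() step here; the with-block cleans up the pool once the results are in hand. The worker function has to live at module level so it can be pickled and sent to the worker processes.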
    