Running a Python script in parallel

忘了有多久 2021-02-04 14:45

I have a huge dataset of videos that I process using a Python script called process.py. The problem is that it takes a lot of time to process the whole dataset, which conta

1 Answer
  • 2021-02-04 15:04

    The multiprocessing documentation (https://docs.python.org/2/library/multiprocessing.html) is actually fairly easy to digest. This section on using a pool of workers (https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers) should be particularly relevant.

    You definitely do not need multiple copies of the same script. Here is an approach you can adopt:

    Assume this is the general structure of your existing script (process.py):

    def convert_vid(fname):
        # do the heavy lifting
        ...

    if __name__ == '__main__':
        # There exist VIDEO_SET_1 to 4, as mentioned in your question
        for file in VIDEO_SET_1:
            convert_vid(file)


    With multiprocessing, you can run the function convert_vid in separate processes. Here is the general scheme:

    from multiprocessing import Pool

    def convert_vid(fname):
        # do the heavy lifting
        ...

    if __name__ == '__main__':
        # map() hands one filename at a time to a worker process, so the
        # separate sets must first be flattened into a single list
        all_files = VIDEO_SET_1 + VIDEO_SET_2 + VIDEO_SET_3 + VIDEO_SET_4
        with Pool(processes=4) as pool:
            pool.map(convert_vid, all_files)
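
    The scheme above can be sketched as a small self-contained script. Note that convert_vid here is only a placeholder stand-in that reports the file it would process, and VIDEO_SET_1/VIDEO_SET_2 are made-up file lists; in your script they would be the real conversion logic and your actual video sets.

    ```python
    from itertools import chain
    from multiprocessing import Pool

    def convert_vid(fname):
        # placeholder for the heavy lifting; just reports what it got
        return f"processed {fname}"

    # hypothetical file lists standing in for your VIDEO_SET_1 .. VIDEO_SET_4
    VIDEO_SET_1 = ["a.mp4", "b.mp4"]
    VIDEO_SET_2 = ["c.mp4"]

    if __name__ == "__main__":
        # chain() flattens the separate sets into one stream of filenames,
        # so each worker receives a single file rather than a whole list
        all_files = list(chain(VIDEO_SET_1, VIDEO_SET_2))
        with Pool(processes=4) as pool:
            results = pool.map(convert_vid, all_files)
        print(results)
    ```

    pool.map preserves the order of the input list, so the results line up with all_files even though the conversions run concurrently.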
    