How to create a progress bar in command line for pool processes?

夙愿已清 submitted on 2021-02-07 10:56:48


I have several scripts that I run using a multiprocessing Pool. I am trying to show a progress bar based on the number of scripts completed.

I checked

but I cannot figure out how to feed the number of completed scripts into the counter.

import os
from multiprocessing import Pool

def run_process(process):
    os.system('python {}'.format(process))

processes = ('', '','','')

if __name__ == "__main__":

    pool = Pool(processes=2)
    pool.map(run_process, processes)


Here's a slightly different approach that uses concurrent.futures.ThreadPoolExecutor instead of a multiprocessing.Pool, which makes it simpler and more efficient than what's in my other answer.

Note that it uses the same print_progress_bar module as my other answer.

import concurrent.futures
import os
import subprocess
import sys

from print_progress_bar import print_progress_bar
progress_bar_kwargs = dict(prefix='Progress:', suffix='Complete', length=40)

# To simplify testing just using one script multiple times.
processes = ('./mp_scripts/', './mp_scripts/',
             './mp_scripts/', './mp_scripts/')
process_count = 0

def run_process(process):
    global process_count

    # Run the script with the same interpreter that is running this program.
    subprocess.run([sys.executable, process])
    # Update process count and progress bar when it's done.
    process_count += 1
    print_progress_bar(process_count, len(processes), **progress_bar_kwargs)

print_progress_bar(0, len(processes), **progress_bar_kwargs) # Print 0% progress.
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:

    future_to_process = {executor.submit(run_process, process): process
                            for process in processes}
    for future in concurrent.futures.as_completed(future_to_process):
        process = future_to_process[future]
        try:
            _ = future.result()
        except Exception as exc:
            print(f'{process} generated an exception: {exc}')
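
A variation on the code above, offered only as a sketch: since as_completed() already yields each future in the calling thread as it finishes, the counter can be advanced there instead of inside the worker, which avoids mutating a global from several threads. The names run_script, run_all, worker and on_progress are hypothetical and not part of the original answer.

```python
import concurrent.futures
import subprocess
import sys


def run_script(path):
    """Run one script with the current interpreter and return its exit code."""
    return subprocess.run([sys.executable, path]).returncode


def run_all(scripts, worker=run_script, on_progress=print, max_workers=2):
    """Run worker(script) for every script, reporting progress from the calling thread."""
    done = 0
    outcomes = []  # (script, result-or-exception) pairs, in completion order
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(worker, script): script for script in scripts}
        for future in concurrent.futures.as_completed(futures):
            script = futures[future]
            try:
                outcomes.append((script, future.result()))
            except Exception as exc:
                outcomes.append((script, exc))
            done += 1  # Only the calling thread ever touches the counter.
            on_progress(done, len(scripts))
    return outcomes
```

To draw the bar, on_progress could be something like lambda done, total: print_progress_bar(done, total, **progress_bar_kwargs).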



You can do it by using pool.apply_async() because it supports a callback function that can be used to know when the target function has returned.

I used @Greenstick's answer to display the progress bar, but I modified it mostly to conform to PEP-8 coding guidelines and placed it in a separate module named print_progress_bar — see below.

Performance note: While one can use multiprocessing.Pool to do this (I strongly suspect the code in your question is a verbatim copy of what's in the article How to run parallel processes), doing so is extremely inefficient because a Python interpreter gets initialized twice as many times as really necessary: first to execute the run_process() function itself, and then again to run the script process.

Spawning processes involves a fair amount of overhead. That overhead can be mitigated by instead running run_process() as a separate thread in the current process, which is lighter-weight.

Switching to a ThreadPool is very easy; just change the line:
    from multiprocessing import Pool
to:
    from multiprocessing.pool import ThreadPool as Pool

Alternatively you can use a concurrent.futures.ThreadPoolExecutor as shown in my other answer.
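
As a rough illustration of the ThreadPool switch combined with the apply_async() callback, here is a minimal sketch. The helper name run_with_callbacks and its parameters are hypothetical; passing the same function as error_callback is my own assumption, made so that a failing task still advances the count.

```python
from multiprocessing.pool import ThreadPool
import threading


def run_with_callbacks(tasks, worker, on_progress, workers=2):
    """Run worker(task) for each task; call on_progress(done, total) as each one ends."""
    done = 0
    lock = threading.Lock()

    def callback(_result):
        # Invoked by the pool after a task finishes (or fails, via error_callback).
        nonlocal done
        with lock:  # Keeps the update safe whichever thread invokes the callback.
            done += 1
            on_progress(done, len(tasks))

    with ThreadPool(processes=workers) as pool:
        pending = [
            pool.apply_async(worker, (task,), callback=callback,
                             error_callback=callback)
            for task in tasks
        ]
        for result in pending:
            result.wait()  # Block until each task is done instead of busy-polling.
    return done
```

Here worker could be the run_process() function and on_progress a thin wrapper around print_progress_bar().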

import os
from multiprocessing import Pool
import subprocess
import sys

from print_progress_bar import print_progress_bar
progress_bar_kwargs = dict(prefix='Progress:', suffix='Complete', length=40)

def run_process(process):
    os.system('{} {}'.format(sys.executable, process))

def callback(_):
    """Update process count and progress bar."""
    global process_count
    process_count += 1
    print_progress_bar(process_count, len(processes), **progress_bar_kwargs)

# To simplify testing just using one script multiple times.
processes = ('./mp_scripts/', './mp_scripts/',
             './mp_scripts/', './mp_scripts/')
process_count = 0

if __name__ == '__main__':

    print_progress_bar(0, len(processes), **progress_bar_kwargs) # Print 0% progress.

    with Pool(processes=2) as pool:
        results = []
        for process in processes:
            r = pool.apply_async(run_process, (process,), {}, callback)
            results.append(r)  # Keep the result so the loop below can wait on it.

        while results:  # Processes still running?
            results = [r for r in results if not r.ready()]


# from
def print_progress_bar(iteration, total, prefix='', suffix='', decimals=1, length=100,
                       fill='█', print_end="\r"):
    """ Print iterations progress.

        Call in a loop to create terminal progress bar
            iteration   - Required  : current iteration (Int)
            total       - Required  : total iterations (Int)
            prefix      - Optional  : prefix string (Str)
            suffix      - Optional  : suffix string (Str)
            decimals    - Optional  : positive number of decimals in percent complete (Int)
            length      - Optional  : character length of bar (Int)
            fill        - Optional  : bar fill character (Str)
            print_end   - Optional  : end character (e.g. "\r", "\r\n") (Str)
    """
    percent = ("{0:." + str(decimals) + "f}").format(100 * (iteration / float(total)))
    filledLength = int(length * iteration // total)
    bar = fill * filledLength + '-' * (length - filledLength)
    print('\r%s |%s| %s%% %s' % (prefix, bar, percent, suffix), end=print_end, flush=True)

    if iteration == total:  # Print newline on completion.
        print()
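
To see what the bar looks like without a terminal, the same formatting can be sketched as a function that returns the text instead of printing it. format_progress_bar is a hypothetical helper and not part of the module above; it mirrors the arithmetic of print_progress_bar() but leaves out the carriage-return handling.

```python
def format_progress_bar(iteration, total, prefix='', suffix='', decimals=1,
                        length=100, fill='█'):
    """Return the progress bar as a string (hypothetical, for illustration only)."""
    percent = f'{100 * iteration / total:.{decimals}f}'
    filled_length = length * iteration // total  # Whole bar cells to fill.
    bar = fill * filled_length + '-' * (length - filled_length)
    return f'{prefix} |{bar}| {percent}% {suffix}'


# Two of four scripts finished, drawn with an 8-character bar.
print(format_progress_bar(2, 4, prefix='Progress:', suffix='Complete', length=8))
# → Progress: |████----| 50.0% Complete
```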

