I am building a script to download and parse benefits information for health insurance plans on Obamacare exchanges. Part of this requires downloading and parsing the plan benefits data.
If you use the standard module “concurrent.futures” and want to process several million items concurrently, the queue of pending tasks will eat all your free memory.
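To make the failure mode concrete, here is a minimal sketch of the pattern that runs out of memory (the process function and the input size are placeholders): every call to submit() enqueues a work item and returns a Future immediately, so nothing stops the submitting loop from running far ahead of the workers.

from concurrent.futures import ProcessPoolExecutor

def process(item):
    ...  # placeholder for per-item work, e.g. parsing one record

if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=5) as pool:
        # Anti-pattern: every task and its Future is created up front,
        # so memory grows with the input size, not with max_workers.
        futures = [pool.submit(process, i) for i in range(10_000_000)]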
You can use bounded-pool-executor instead: https://github.com/mowshon/bounded_pool_executor
pip install bounded-pool-executor
Example:
from bounded_pool_executor import BoundedProcessPoolExecutor
from time import sleep
from random import randint

def do_job(num):
    # Simulate a unit of work that takes 1-10 seconds.
    sleep_sec = randint(1, 10)
    print('value: %d, sleep: %d sec.' % (num, sleep_sec))
    sleep(sleep_sec)

if __name__ == '__main__':
    # submit() blocks once max_workers tasks are pending, so memory
    # stays bounded even for very large inputs.
    with BoundedProcessPoolExecutor(max_workers=5) as worker:
        for num in range(10000):
            print('#%d Worker initialization' % num)
            worker.submit(do_job, num)
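If you would rather avoid the extra dependency, the same back-pressure can be sketched with the standard library alone: a semaphore that makes submit() block once a fixed number of tasks are in flight. The BoundedExecutor wrapper below is hypothetical (not part of concurrent.futures), but it shows the same basic idea.

from concurrent.futures import ProcessPoolExecutor
from threading import BoundedSemaphore

class BoundedExecutor:
    # Hypothetical wrapper: at most `bound` tasks may be pending at once.
    def __init__(self, max_workers, bound):
        self._pool = ProcessPoolExecutor(max_workers=max_workers)
        self._sem = BoundedSemaphore(bound)

    def submit(self, fn, *args, **kwargs):
        self._sem.acquire()  # blocks while `bound` tasks are pending
        try:
            future = self._pool.submit(fn, *args, **kwargs)
        except Exception:
            self._sem.release()
            raise
        future.add_done_callback(lambda _: self._sem.release())
        return future

    def shutdown(self, wait=True):
        self._pool.shutdown(wait=wait)

Used in place of BoundedProcessPoolExecutor above, worker = BoundedExecutor(max_workers=5, bound=10) keeps at most ten tasks queued while the loop submits all 10000.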