Memory usage with concurrent.futures.ThreadPoolExecutor in Python3

盖世英雄少女心 2021-02-05 21:42

I am building a script to download and parse benefits information for health insurance plans on Obamacare exchanges. Part of this requires downloading and parsing the plan benef

4 Answers
  •  臣服心动
    2021-02-05 22:37

    It's not your fault. as_completed() doesn't release its futures until the whole iteration completes. There's an issue logged already: https://bugs.python.org/issue27144

    For now, I think the usual approach is to wrap as_completed() inside another loop that splits the work into chunks of a sane number of futures, depending on how much RAM you want to spend and how big your results will be. It blocks on each chunk until every job in it has finished before moving on to the next chunk, so it will be slower and can stall for a long time on one slow job, but I see no other way for now. I will update this answer if a smarter way comes along.
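
A minimal sketch of the chunking approach described above, assuming each job is a plain function called on one item. The names `chunked`, `run_in_chunks`, `chunk_size`, and `max_workers` are illustrative and not from the original question; the defaults are arbitrary and should be tuned to the RAM you can spare.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def run_in_chunks(func, items, chunk_size=100, max_workers=8):
    """Yield func(item) results while holding at most `chunk_size` futures."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        for chunk in chunked(items, chunk_size):
            # Submit only one chunk, so only its futures (and results) exist.
            futures = [executor.submit(func, item) for item in chunk]
            for future in as_completed(futures):
                yield future.result()
            # Drop this chunk's futures before the next chunk is submitted.
            del futures
```

With hypothetical names, usage might look like `for parsed in run_in_chunks(parse_plan, plan_urls, chunk_size=50): save(parsed)`. Results come back in completion order within a chunk, not in input order, and the pool idles at the end of each chunk while the slowest job finishes.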
