Execute batch of promises in series. Once Promise.all is done go to the next batch

星月不相逢 2021-02-19 15:57

I have an array that contains arrays of promises, and each inner array could have 4k, 2k or 500 promises.

In total there are around 60k promises and I may test

6 answers
  • 2021-02-19 16:35

    @jfriend00 Just adding to your answer using async/await with reduce:

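    function runPromisesInSeries(bigArray, getInfoForEveryInnerArgument) {
      // Each step waits for the accumulated results, then processes the next inner array.
      // A rejection propagates through the returned promise, so a synchronous
      // try/catch around reduce() would never see it and is not needed.
      return bigArray.reduce(async (acc, cItem) => {
        const results = await acc
        const data = await getInfoForEveryInnerArgument(cItem)
        results.push(data)
        return results
      }, Promise.resolve([]))
    }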
    
  • 2021-02-19 16:36

    You can do it recursively. For example, I needed to put about 60k documents into Mongo, which was too much to do in one step, so I take 1k documents, send them to Mongo, and once that has finished I take the next 1k, and so on.

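    exports.rawRecursive = (arr, start) => {
            // Ending condition: every document has been sent.
            // (>= rather than >, so an empty slice is never inserted.)
            if (start >= arr.length) {
                return;
            }
    
            // Insert the next 1000 documents, then recurse once they are stored.
            Rawmedicament.insertManyAsync(_.slice(arr, start, start + 1000)).then(() => {
                exports.rawRecursive(arr, start + 1000);
            });
    };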
    

    If you want to be notified when everything is done, you can invoke a callback in the ending condition or, if you prefer promises, call resolve() there.
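
    The promise-based variant mentioned above could look roughly like this. It is only a sketch: insertInChunks, insertChunk and fakeInsert are hypothetical names, with insertChunk standing in for a promise-returning call such as Rawmedicament.insertManyAsync, and the chunk size is a parameter instead of a hard-coded 1000.

```javascript
// Sketch: process `items` in chunks, one chunk at a time, and return a promise
// that resolves once the last chunk is done.
// `insertChunk` is a hypothetical stand-in for the real insert call;
// it must return a promise.
function insertInChunks(items, insertChunk, chunkSize = 1000, start = 0) {
  // Ending condition: nothing left to send.
  if (start >= items.length) {
    return Promise.resolve();
  }
  const chunk = items.slice(start, start + chunkSize);
  // Returning the chain lets the caller wait for every chunk and see errors.
  return insertChunk(chunk).then(() =>
    insertInChunks(items, insertChunk, chunkSize, start + chunkSize)
  );
}

// Usage with a fake insert that only records the size of each chunk.
const chunkSizes = [];
const fakeInsert = (chunk) => {
  chunkSizes.push(chunk.length);
  return Promise.resolve();
};
const documents = Array.from({ length: 2500 }, (_, i) => ({ id: i }));
const done = insertInChunks(documents, fakeInsert).then(() => {
  console.log('all chunks inserted:', chunkSizes);
});
```

    Because each call returns the chain it starts, a rejected insert reaches the caller's .catch() instead of being lost.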

  • 2021-02-19 16:37

    An answer from October 2020. Async/await makes it short: only 10 lines of code plus JSDoc.

    /**
     * Same as Promise.all(), but it waits for the first {batchSize} promises to finish
     * before starting the next batch.
     *
     * @template A
     * @template B
     * @param {function(A): (B|Promise<B>)} task The task to run for each item.
     * @param {A[]} items Arguments to pass to the task for each call.
     * @param {number} batchSize Number of tasks to run in parallel per batch.
     * @returns {Promise<B[]>}
     */
    async function promiseAllInBatches(task, items, batchSize) {
        let position = 0;
        let results = [];
        while (position < items.length) {
            const itemsForBatch = items.slice(position, position + batchSize);
            results = [...results, ...await Promise.all(itemsForBatch.map(item => task(item)))];
            position += batchSize;
        }
        return results;
    }
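
    Promise.all() rejects as soon as one task fails, which stops the remaining batches. If every item should be attempted regardless, a variant built on Promise.allSettled() is one option. This is a sketch under that assumption: settleAllInBatches and flakyDouble are hypothetical names, and batchSize is assumed to be a positive integer.

```javascript
// Sketch: same batching idea, but Promise.allSettled keeps going when a task fails.
async function settleAllInBatches(task, items, batchSize) {
  const results = [];
  for (let position = 0; position < items.length; position += batchSize) {
    const batch = items.slice(position, position + batchSize);
    // Tasks in one batch start together; the next batch waits for all of them.
    // The async wrapper also turns a synchronous throw in `task` into a rejection.
    const settled = await Promise.allSettled(batch.map(async (item) => task(item)));
    results.push(...settled);
  }
  return results;
}

// Usage: a hypothetical task that fails for one input.
const flakyDouble = async (n) => {
  if (n === 3) throw new Error('cannot process 3');
  return n * 2;
};
const settledRun = settleAllInBatches(flakyDouble, [1, 2, 3, 4, 5], 2).then((results) => {
  // Each entry is { status: 'fulfilled', value } or { status: 'rejected', reason }.
  const values = results.map((r) => (r.status === 'fulfilled' ? r.value : r.reason.message));
  console.log(values);
  return values;
});
```

    The caller has to check the status of each entry, since the returned promise no longer rejects when a single task fails.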
    
  • 2021-02-19 16:41

    Your question is a bit misnamed which may have confused some folks in this question and in the previous version of this question. You are trying to execute a batch of async operations in series, one batch of operations, then when that is done execute another batch of operations. The results of those async operations are tracked with promises. Promises themselves represent async operations that have already been started. "Promises" aren't executed themselves. So technically, you don't "execute a batch of promises in series". You execute a set of operations, track their results with promises, then execute the next batch when the first batch is all done.
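
    A small sketch may make the distinction concrete. The operation() helper and the log array are made up for illustration; the point is only that an array of promises is already running, while an array of functions that return promises can still be started one at a time.

```javascript
// Sketch: a promise's work starts when the promise is created, not when it is awaited.
const log = [];

// Hypothetical async operation: records when it starts and when it finishes.
function operation(name) {
  log.push(`start ${name}`);
  return Promise.resolve().then(() => {
    log.push(`finish ${name}`);
    return name;
  });
}

// Eager: both operations are already running once this array exists.
const promises = [operation('a'), operation('b')];
// Lazy: nothing runs until one of these functions is called.
const thunks = [() => operation('c'), () => operation('d')];

async function runInSeries(fns) {
  const results = [];
  for (const fn of fns) {
    // Start the next operation only after the previous one has settled.
    results.push(await fn());
  }
  return results;
}

const demo = Promise.all(promises)
  .then(() => runInSeries(thunks))
  .then(() => {
    console.log(log);
    return log;
  });
```

    In the log, a and b both start before either finishes, while c finishes before d starts. That is why serializing requires holding on to the arguments (or functions), not the promises.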

    Anyway, here's a solution to serializing each batch of operations.

    You can create an inner function which I usually call next() that lets you process each iteration. When the promise resolves from processing one innerArray, you call next() again:

    function mainFunction() {
        return new Promise(function(resolve, reject) {
            var bigArray = [[argument1, argument2, argument3, argument4], [argument5, argument6, argument7, argument8], ....];
            // the sum of all arguments is over 60k...
            var results = [];
    
            var index = 0;
            function next() {
                if (index < bigArray.length) {
                    getInfoForEveryInnerArgument(bigArray[index++]).then(function(data) {
                        results.push(data);
                        next();
                    }, reject);
                } else {
                    resolve(results);
                }
            }
            // start first iteration
            next();
        });
    }
    

    This also collects all the sub-results into a results array and returns a master promise whose resolved value is this results array. So, you could use it like this:

    mainFunction().then(function(results) {
        // final results array here and everything done
    }, function(err) {
        // some error here
    });
    

    You could also use the .reduce() design pattern for iterating an array serially:

    function mainFunction() {
        var bigArray = [[argument1, argument2, argument3, argument4], [argument5, argument6, argument7, argument8], ....];
        return bigArray.reduce(function(p, item) {
            return p.then(function(results) {
                return getInfoForEveryInnerArgument(item).then(function(data) {
                    results.push(data);
                    return results;
                })
            });
        }, Promise.resolve([]));
    }
    

    This creates more simultaneous promises than the first option and I don't know if that is an issue for such a large set of promises (which is why I offered the original option), but this code is cleaner and the concept is convenient to use for other situations too.
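
    Where async/await is available, the same serial iteration can be written as a plain loop, which builds no promise chain up front. This is a sketch, not part of the original answer: processBatchesInSeries and fakeGetInfo are hypothetical names, and fakeGetInfo only assumes that getInfoForEveryInnerArgument runs one inner array in parallel and returns a promise.

```javascript
// Sketch: serial iteration over the outer array with an async loop.
// `getInfo` stands in for getInfoForEveryInnerArgument and must return a promise.
async function processBatchesInSeries(bigArray, getInfo) {
  const results = [];
  for (const innerArray of bigArray) {
    // The next batch is not started until this one has resolved.
    results.push(await getInfo(innerArray));
  }
  return results;
}

// Hypothetical batch worker: handles every argument of one inner array in parallel.
const order = [];
function fakeGetInfo(innerArray) {
  order.push(`batch of ${innerArray.length} started`);
  return Promise.all(innerArray.map((arg) => Promise.resolve(arg * 10)));
}

const seriesRun = processBatchesInSeries([[1, 2], [3], [4, 5, 6]], fakeGetInfo).then((results) => {
  console.log(results);
  return results;
});
```

    A rejection from any batch stops the loop and rejects the returned promise, matching the behavior of the two versions above.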


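    FYI, there are some promise add-on features built for doing this for you. The Bluebird promise library (which is a great library for development using promises) has Promise.mapSeries(), which calls the mapper on one item at a time. Note that plain Promise.map() runs the items concurrently unless you pass a concurrency option, so it does not serialize the batches by itself:

    var Promise = require('bluebird');

    function mainFunction() {
        var bigArray = [[argument1, argument2, argument3, argument4], [argument5, argument6, argument7, argument8], ....];
        // mapSeries waits for each returned promise before moving on to the next item
        return Promise.mapSeries(bigArray, getInfoForEveryInnerArgument);
    }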
  • 2021-02-19 16:43

    In addition, if the original array holds objects to be processed rather than promises, batch processing can be done without an external dependency using a combination of Array.prototype.map(), Array.prototype.slice() and Promise.all():

    // Main batch parallelization function.
    function batch(tasks, pstart, atonce, runner, pos) {
      if (!pos) pos = 0;
      if (pos >= tasks.length) return pstart;
      var p = pstart.then(function() {
        output('Batch:', pos / atonce + 1);
        return Promise.all(tasks.slice(pos, pos + atonce).map(function(task) {
          return runner(task);
        }));
      });
      return batch(tasks, p, atonce, runner, pos + atonce);
    }
    
    // Output function for the example
    function output() {
      document.getElementById("result").innerHTML += Array.prototype.slice.call(arguments).join(' ') + "<br />";
      window.scrollTo(0, document.body.scrollHeight);
    }
    
    /*
     * Example code.
     * Note: Task runner should return Promise.
     */
    function taskrunner(task) {
      return new Promise(function(resolve, reject) {
        setTimeout(function() {
          output('Processed:', task.text, 'Delay:', task.delay);
          resolve();
        }, task.delay);
      });
    }
    
    var taskarray = [];
    function populatetasks(size) {
      taskarray = [];
      for (var i = 0; i < size; i++) {
        taskarray.push({
          delay: 500 + Math.ceil(Math.random() * 50) * 10,
          text: 'Item ' + (i + 1)
        });
      }
    }
    
    function clean() {
      document.getElementById("result").innerHTML = '';
    }
    
    var init = Promise.resolve();
    function start() {
      var bsize = parseInt(document.getElementById("batchsize").value, 10),
        tsize = parseInt(document.getElementById("taskssize").value, 10);
      populatetasks(tsize);
      init = batch(taskarray.slice() /*tasks array*/ , init /*starting promise*/ , bsize /*batch size*/ , taskrunner /*task runner*/ );
    }
    <!-- HTML for the example above -->
    <input type="button" onclick="start()" value="Start" />
    <input type="button" onclick="clean()" value="Clear" />&nbsp;Batch size:&nbsp;
    <input id="batchsize" value="4" size="2"/>&nbsp;Tasks:&nbsp;
    <input id="taskssize" value="10" size="2"/>
    <pre id="result"></pre>

  • 2021-02-19 17:00

    Dynamically batching more promises

    A simple implementation where you can have a queue of tasks batched to run in parallel and add more dynamically:

    class TaskQueue {
      constructor ({
        makeTask,
        initialData = [],
        getId = data => data.id,
        batchSize = 15,
        onComplete = () => {},
      }) {
        if (!makeTask) throw new Error('The "makeTask" parameter is required');
    
        this.makeTask = makeTask;
        this.getId = getId;
        this.batchSize = batchSize;
        this.onComplete = onComplete;
        this.queue = new Map();
    
        // add() takes rest parameters, so the initial array must be spread
        this.add(...initialData);
      }
    
      add(...data) {
        data.forEach(item => {
          const id = this.getId(item);
          if (this.queue.has(id)) return;
    
          this.queue.set(id, item);
        });
    
        // running automatically on create or additional items added
        this.runNextBatch();
      }
    
      runNextBatch () {
        if (this.queueStarted) return;
        if (this.queue.size === 0) return;
    
        this.queueStarted = true;
        const currentBatchData = Array.from(this.queue.values()).slice(0, this.batchSize);
    
        const tasks = currentBatchData.map(data => {
          const id = this.getId(data);
    
          // Have some error handling implemented in `makeTask`.
          // The promise must be returned, otherwise Promise.all() below has nothing to wait for.
          return this.makeTask(data)
            .finally(() => this.queue.delete(id));
        });
    
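        return Promise.all(tasks).then(
          () => {
            // Release the lock before deciding what to do next, so that it is
            // not reset while a following batch is already running.
            this.queueStarted = false;
            if (this.queue.size === 0) this.onComplete();
            else this.runNextBatch();
          },
          (err) => {
            this.queueStarted = false;
            throw err;
          }
        );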
      }
    }
    
    // Usage
    const lotOfFilesForUpload = [{ uri: 'file://some-path' }, { uri: 'file://some-other-path' }];
    
    // Fake upload; declared async so that it returns a promise, as `makeTask` requires
    const upload = async (file) => console.log('fake uploading file: ', file);
    
    const taskQueue = new TaskQueue({
      initialData: lotOfFilesForUpload,
      getId: file => file.uri,
      makeTask: file => upload(file),
      onComplete: () => console.log('Queue completed'),
    });
    
    // You can add more tasks dynamically
    taskQueue.add({ uri: 'file://yet-another-file' });
    
    