What is the best way to limit concurrency when using ES6's Promise.all()?

执念已碎 2020-11-29 21:28

I have some code that is iterating over a list that was queried out of a database and making an HTTP request for each element in that list. That list can sometimes be a reasonably large number, and I would like to limit the number of concurrent HTTP requests so I am not hammering the server.

17 Answers
  • 2020-11-29 21:45

    Here is my ES7 solution: a copy-paste friendly, feature-complete alternative to Promise.all()/map() with a concurrency limit.

    Similar to Promise.all(), it maintains return order and falls back gracefully for non-promise return values.

    I also included a comparison of the different implementations, as it illustrates some aspects a few of the other solutions have missed.

    Usage

    const asyncFn = delay => new Promise(resolve => setTimeout(() => resolve(), delay));
    const args = [30, 20, 15, 10];
    await asyncPool(args, arg => asyncFn(arg), 4); // concurrency limit of 4
    

    Implementation

    async function asyncBatch(args, fn, limit = 8) {
      // Copy arguments to avoid side effects
      args = [...args];
      const outs = [];
      while (args.length) {
        const batch = args.splice(0, limit);
        const out = await Promise.all(batch.map(fn));
        outs.push(...out);
      }
      return outs;
    }
    
    async function asyncPool(args, fn, limit = 8) {
      return new Promise((resolve) => {
        // Copy arguments to avoid side effects; reverse the queue
        // because pop() is faster than shift()
        const argQueue = [...args].reverse();
        let count = 0;
        const outs = [];
        const pollNext = () => {
          if (argQueue.length === 0 && count === 0) {
            resolve(outs);
          } else {
            while (count < limit && argQueue.length) {
              const index = args.length - argQueue.length;
              const arg = argQueue.pop();
              count += 1;
              const out = fn(arg);
              const processOut = (out, index) => {
                outs[index] = out;
                count -= 1;
                pollNext();
              };
              // Guard against null and detect any thenable,
              // not just plain objects with a then property
              if (out && typeof out.then === 'function') {
                out.then(out => processOut(out, index));
              } else {
                processOut(out, index);
              }
            }
          }
        };
        pollNext();
      });
    }
    

    Comparison

    // A simple async function that returns after the given delay
    // and prints its value to allow us to determine the response order
    const asyncFn = delay => new Promise(resolve => setTimeout(() => {
      console.log(delay);
      resolve(delay);
    }, delay));
    
    // List of arguments to the asyncFn function
    const args = [30, 20, 15, 10];
    
    // As a comparison of the different implementations, a low concurrency
    // limit of 2 is used in order to highlight the performance differences.
    // If a limit greater than or equal to args.length is used the results
    // would be identical.
    
    // Vanilla Promise.all/map combo
    const out1 = await Promise.all(args.map(arg => asyncFn(arg)));
    // prints: 10, 15, 20, 30
    // total time: 30ms
    
    // Pooled implementation
    const out2 = await asyncPool(args, arg => asyncFn(arg), 2);
    // prints: 20, 30, 15, 10
    // total time: 40ms
    
    // Batched implementation
    const out3 = await asyncBatch(args, arg => asyncFn(arg), 2);
    // prints: 20, 30, 10, 15
    // total time: 45ms
    
    console.log(out1, out2, out3); // prints: [30, 20, 15, 10] x 3
    
    // Conclusion: Execution order and performance are different,
    // but return order is still identical
    

    Conclusion

    asyncPool() should be the best solution as it allows new requests to start as soon as any previous one finishes.

    asyncBatch() is included for comparison because its implementation is simpler to understand, but it should be slower, since all requests in the same batch are required to finish before the next batch starts.

    In this contrived example the non-limited vanilla Promise.all() is of course the fastest, while the others may perform better in a real-world congestion scenario.

    Update

    The async-pool library that others have already suggested is probably a better alternative to my implementation as it works almost identically and has a more concise implementation with a clever usage of Promise.race(): https://github.com/rxaviers/async-pool/blob/master/lib/es7.js
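
    For illustration, here is a minimal sketch of that Promise.race() technique (my own paraphrase, not the library's exact code; racePool is a made-up name):

    async function racePool(limit, items, iteratorFn) {
      const results = [];
      const executing = new Set();
      for (const item of items) {
        // Start the task and remember its promise in result order
        const p = Promise.resolve().then(() => iteratorFn(item));
        results.push(p);
        executing.add(p);
        const clean = () => executing.delete(p);
        p.then(clean, clean);
        // Once the pool is full, wait for the fastest task to settle
        if (executing.size >= limit) {
          await Promise.race(executing);
        }
      }
      return Promise.all(results);
    }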

    Hopefully my answer still has educational value.

  • 2020-11-29 21:46

    Using Array.prototype.splice

    while (funcs.length) {
      // 100 at a time
      await Promise.all( funcs.splice(0, 100).map(f => f()) )
    }
    
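    Here funcs is assumed to be an array of promise-returning functions, for example (urls is a placeholder):

    // build one deferred request per URL so nothing starts until f() is called
    const funcs = urls.map(url => () => fetch(url).then(res => res.json()))
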
  • 2020-11-29 21:46

    If you know how iterators work and how they are consumed, you wouldn't need any extra library, since building your own concurrency limiter becomes very easy. Let me demonstrate:

    // [Symbol.iterator]() is equivalent to .values():
    // const iterator = [1,2,3][Symbol.iterator]()
    const iterator = [1,2,3].values()
    
    
    // loop over all items with for..of
    for (const x of iterator) {
      console.log('x:', x)
      
      // notice how this loop continues the same iterator
      // and consumes the rest of it, so the outer loop
      // doesn't log any more x's
      for (const y of iterator) {
        console.log('y:', y)
      }
    }

    We can use the same iterator and share it across workers.

    If you had used .entries() instead of .values(), you would have gotten an iterator of [index, value] pairs, which I will demonstrate below with a concurrency of 2

    const sleep = t => new Promise(rs => setTimeout(rs, t))
    
    async function doWork(iterator) {
      for (let [index, item] of iterator) {
        await sleep(1000)
        console.log(index + ': ' + item)
      }
    }
    
    const iterator = Array.from('abcdefghij').entries()
    const workers = new Array(2).fill(iterator).map(doWork)
    //    ^--- starts two workers sharing the same iterator
    
    Promise.allSettled(workers).then(() => console.log('done'))

    The benefit of this is that you can have a generator function instead of having everything ready at once.
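
    A rough sketch of that idea, reusing the sleep helper above (produceWork is a made-up name):

    function* produceWork() {
      // items are created on demand instead of upfront
      for (let i = 0; i < 10; i++) yield i
    }

    const gen = produceWork()
    const genWorkers = new Array(2).fill(gen).map(async it => {
      for (const n of it) {
        await sleep(500)
        console.log('handled', n)
      }
    })
    Promise.allSettled(genWorkers).then(() => console.log('done'))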


    Note: the difference between this and async-pool is that this spawns two workers, so if one worker throws an error at, say, index 5, it won't stop the other worker from doing the rest. You just drop from a concurrency of 2 down to 1 (it won't stop there). So my advice is to catch all errors inside the doWork function.
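
    For example, a hedged variant of doWork that records failures instead of letting one error kill a worker:

    async function doWork(iterator) {
      for (const [index, item] of iterator) {
        try {
          await sleep(1000)
          console.log(index + ': ' + item)
        } catch (err) {
          // record the failure so this worker keeps consuming items
          console.error('item ' + index + ' failed:', err)
        }
      }
    }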

  • 2020-11-29 21:48

    I suggest the library async-pool: https://github.com/rxaviers/async-pool

    npm install tiny-async-pool
    

    Description:

    Run multiple promise-returning & async functions with limited concurrency using native ES6/ES7

    asyncPool runs multiple promise-returning & async functions in a limited concurrency pool. It rejects immediately as soon as one of the promises rejects. It resolves when all the promises complete. It calls the iterator function as soon as possible (under the concurrency limit).

    Usage:

    const timeout = i => new Promise(resolve => setTimeout(() => resolve(i), i));
    await asyncPool(2, [1000, 5000, 3000, 2000], timeout);
    // Call iterator (i = 1000)
    // Call iterator (i = 5000)
    // Pool limit of 2 reached, wait for the quicker one to complete...
    // 1000 finishes
    // Call iterator (i = 3000)
    // Pool limit of 2 reached, wait for the quicker one to complete...
    // 3000 finishes
    // Call iterator (i = 2000)
    // Iteration is complete, wait until running ones complete...
    // 5000 finishes
    // 2000 finishes
    // Resolves, results are passed in given array order `[1000, 5000, 3000, 2000]`.
    
  • 2020-11-29 21:50

    So I tried to make some of the examples shown here work for my code, but since this was only for an import script and not production code, using the npm package batch-promises was surely the easiest path for me.

    NOTE: Requires runtime to support Promise or to be polyfilled.

    Api: batchPromises(int: batchSize, array: Collection, i => Promise: Iteratee). The Iteratee will be called after each batch.

    Use:

    import batchPromises from 'batch-promises';
     
    batchPromises(2, [1,2,3,4,5], i => new Promise((resolve, reject) => {
     
      // The iteratee will fire after each batch resulting in the following behaviour:
      // @ 100ms resolve items 1 and 2 (first batch of 2)
      // @ 200ms resolve items 3 and 4 (second batch of 2)
      // @ 300ms resolve remaining item 5 (last remaining batch)
      setTimeout(() => {
        resolve(i);
      }, 100);
    }))
    .then(results => {
      console.log(results); // [1,2,3,4,5]
    });

  • 2020-11-29 21:50

    So many good solutions. I started out with the elegant solution posted by @Endless and ended up with this little extension method that does not use any external libraries nor does it run in batches (although it assumes you have features like async/await available):

    Promise.allWithLimit = async (taskList, limit = 5) => {
        const iterator = taskList.entries();
        let results = new Array(taskList.length);
        let workerThreads = new Array(limit).fill(0).map(() => 
            new Promise(async (resolve, reject) => {
                try {
                    let entry = iterator.next();
                    while (!entry.done) {
                        let [index, task] = entry.value;
                        try {
                            // Tasks are promise-returning functions, so the
                            // work only starts when a worker picks one up
                            results[index] = await task();
                        }
                        catch (err) {
                            results[index] = err;
                        }
                        // Advance even after a failure, otherwise a rejected
                        // task would be awaited again forever
                        entry = iterator.next();
                    }
                    // No more work to do
                    resolve(true); 
                }
                catch (err) {
                    // This worker is dead
                    reject(err);
                }
            }));
    
        await Promise.all(workerThreads);
        return results;
    };
    

    
    const demoTasks = new Array(10).fill(0).map((v,i) => () => new Promise(resolve => {
       let n = (i + 1) * 5;
       setTimeout(() => {
          console.log(`Did nothing for ${n} seconds`);
          resolve(n);
       }, n * 1000);
    }));

    Promise.allWithLimit(demoTasks).then(results => console.log(results));
