What is the best way to limit concurrency when using ES6's Promise.all()?

执念已碎 2020-11-29 21:28

I have some code that is iterating over a list that was queried out of a database and making an HTTP request for each element in that list. That list can sometimes be a rea

17 Answers
  • 2020-11-29 22:06

    Instead of using promises to limit HTTP requests, use Node's built-in http.Agent.maxSockets. This removes the need to use a library or write your own pooling code, and has the added advantage of giving you more control over what you're limiting.

    agent.maxSockets

    By default set to Infinity. Determines how many concurrent sockets the agent can have open per origin. Origin is either a 'host:port' or 'host:port:localAddress' combination.

    For example:

    var http = require('http');
    var agent = new http.Agent({maxSockets: 5}); // 5 concurrent connections per origin
    var request = http.request({..., agent: agent}, ...);
    

    If making multiple requests to the same origin, it might also benefit you to set keepAlive to true (see docs above for more info).
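
    To see the per-origin limit in action, here is a minimal sketch that needs nothing beyond Node's http module. The function name measurePeakConcurrency, the throwaway local server, and the request counts are assumptions made for illustration, not part of the answer above. The server counts how many requests it is handling at once while the client sends them all through one agent.

```javascript
const http = require('http');

// Send `count` GET requests through a single agent limited to `maxSockets`
// and resolve with the highest number of requests the server saw at once.
function measurePeakConcurrency(maxSockets, count) {
  return new Promise((resolve, reject) => {
    let inFlight = 0;
    let peak = 0;

    // Throwaway local server that holds each request open for a short time.
    const server = http.createServer((req, res) => {
      inFlight += 1;
      peak = Math.max(peak, inFlight);
      setTimeout(() => {
        inFlight -= 1;
        res.end('ok');
      }, 20);
    });

    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      const agent = new http.Agent({ maxSockets });
      let finished = 0;

      for (let i = 0; i < count; i++) {
        http.get({ host: '127.0.0.1', port, path: '/', agent }, (res) => {
          res.resume(); // discard the body so 'end' fires
          res.on('end', () => {
            finished += 1;
            if (finished === count) {
              agent.destroy();
              server.close(() => resolve(peak));
            }
          });
        }).on('error', reject);
      }
    });
  });
}

measurePeakConcurrency(2, 6).then((peak) => {
  console.log('peak concurrent requests:', peak);
});
```

    All six requests are issued immediately, but the agent queues whatever does not fit into its sockets, so the server should never see more than two at a time.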

  • 2020-11-29 22:11

    Note that Promise.all() doesn't trigger the promises to start their work; creating the promise itself does.

    With that in mind, one solution would be to check whenever a promise is resolved whether a new promise should be started or whether you're already at the limit.
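
    A minimal sketch of that idea, assuming the work is supplied as an array of functions that each return a promise (so nothing starts until the limiter calls it) and that limit is at least 1. The names runLimited and fakeRequest are hypothetical, chosen only for this illustration.

```javascript
// Run `tasks` (functions returning promises) with at most `limit` in flight.
// Resolves with the results in input order; rejects on the first failure,
// after which no further tasks are started (running ones are not cancelled).
function runLimited(tasks, limit) {
  return new Promise((resolve, reject) => {
    const results = new Array(tasks.length);
    let nextIndex = 0;  // index of the next task to start
    let active = 0;     // tasks currently in flight
    let completed = 0;  // tasks that have fulfilled
    let failed = false;

    const startMore = () => {
      if (completed === tasks.length) {
        resolve(results);
        return;
      }
      // Each time a task settles, top the pool back up to the limit.
      while (!failed && active < limit && nextIndex < tasks.length) {
        const index = nextIndex++;
        active++;
        Promise.resolve()
          .then(() => tasks[index]())
          .then((value) => {
            results[index] = value;
            active--;
            completed++;
            startMore();
          }, (error) => {
            failed = true;
            reject(error);
          });
      }
    };

    startMore();
  });
}

// Hypothetical stand-in for an HTTP request: resolves with `value` after `ms` ms.
const fakeRequest = (value, ms) => () =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

runLimited([fakeRequest('a', 30), fakeRequest('b', 10), fakeRequest('c', 20)], 2)
  .then((results) => console.log(results)); // results keep input order: a, b, c
```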

    However, there is really no need to reinvent the wheel here. One library that you could use for this purpose is es6-promise-pool. From their examples:

    // On the Web, leave out this line and use the script tag above instead. 
    var PromisePool = require('es6-promise-pool')
    
    var promiseProducer = function () {
      // Your code goes here. 
      // If there is work left to be done, return the next work item as a promise. 
      // Otherwise, return null to indicate that all promises have been created. 
      // Scroll down for an example. 
    }
    
    // The number of promises to process simultaneously. 
    var concurrency = 3
    
    // Create a pool. 
    var pool = new PromisePool(promiseProducer, concurrency)
    
    // Start the pool. 
    var poolPromise = pool.start()
    
    // Wait for the pool to settle. 
    poolPromise.then(function () {
      console.log('All promises fulfilled')
    }, function (error) {
      console.log('Some promise rejected: ' + error.message)
    })
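
    The producer in the library's example is left empty. As a rough sketch of what it might look like for a list of items, assuming a hypothetical work list (ids) and worker (handleItem) that are not part of the original answer: each call starts one more item and returns its promise, and once the list is exhausted the producer returns null. The resulting function would be passed as the first argument to new PromisePool(...), as in the snippet above.

```javascript
// Hypothetical work list and worker; neither appears in the original answer.
const ids = [1, 2, 3, 4, 5];

function handleItem(id) {
  // Stand-in for real asynchronous work such as an HTTP request.
  return new Promise((resolve) => setTimeout(() => resolve(id * 10), 10));
}

// Build a producer over a list. Work starts only when the producer is called,
// which is what lets the pool decide how many items run at once.
function makeProducer(items, worker) {
  let cursor = 0;
  return function promiseProducer() {
    if (cursor >= items.length) {
      return null; // signals that all promises have been created
    }
    return worker(items[cursor++]);
  };
}

const promiseProducer = makeProducer(ids, handleItem);
```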
    
  • 2020-11-29 22:11

    P-Limit

    I have compared promise concurrency limiting using a custom script, bluebird, es6-promise-pool, and p-limit. I believe that p-limit has the simplest, most stripped-down implementation for this need. See their documentation.

    Requirements

    To be compatible with the async/await syntax used in the example:

    • ECMAScript 2017 (version 8)
    • Node version > 8.2.1

    My Example

    In this example, we need to run a function for every URL in the array (for instance, an API request). Here that function is called fetchData(). If we had an array of thousands of items to process, limiting concurrency would definitely help save CPU and memory resources.

    const pLimit = require('p-limit');
    
    // Example: allow at most 3 promises to run at once
    const limit = pLimit(3);
    
    let urls = [
        "http://www.exampleone.com/",
        "http://www.exampletwo.com/",
        "http://www.examplethree.com/",
        "http://www.examplefour.com/",
    ]
    
    // Create an array of our promises using map (fetchData() returns a promise)
    let promises = urls.map(url => {
    
        // wrap the function we are calling in the limit function we defined above
        return limit(() => fetchData(url));
    });
    
    (async () => {
        // Only three promises are run at once (as defined above)
        const result = await Promise.all(promises);
        console.log(result);
    })();
    

    The logged result is an array containing the resolved value of each promise.

  • 2020-11-29 22:11
    • @tcooc's answer was quite cool. Didn't know about it and will leverage it in the future.
    • I also enjoyed @MatthewRideout's answer, but it uses an external library!!

    Whenever possible, I try to develop this kind of thing on my own rather than reaching for a library. You end up learning a lot of concepts that seemed daunting before.

    What do you guys think of this attempt:
    (I gave it a lot of thought and I think it is working, but do point out if it isn't or there is something fundamentally wrong)

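    class Pool {
        constructor(maxAsync) {
            this.maxAsync = maxAsync;          // maximum number of operations in flight
            this.asyncOperationsQueue = [];    // pending operations, oldest first
            this.currentAsyncOperations = 0;   // operations currently running
        }
    
        runAnother() {
            if (this.asyncOperationsQueue.length > 0 && this.currentAsyncOperations < this.maxAsync) {
                this.currentAsyncOperations += 1;
                // whether the operation succeeds or fails, free its slot and try to start the next one
                const done = () => { this.currentAsyncOperations -= 1; this.runAnother() };
                // shift() takes the oldest entry, so operations start in the order they were added
                this.asyncOperationsQueue.shift()().then(done, done)
            }
        }
    
        add(f){  // the argument f is a function of signature () => Promise
            const result = new Promise((resolve, reject) => {
                this.asyncOperationsQueue.push(
                    () => f().then(resolve, reject)
                )
            });
            // call runAnother() only after queueing; calling it first would leave a lone add() waiting forever
            this.runAnother();
            return result;
        }
    }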
    
    //#######################################################
    //                        TESTS
    //#######################################################
    
    function dbCall(id, timeout, fail) {
        return new Promise((resolve, reject) => {
            setTimeout(() => {
                if (fail) {
                   reject(`Error for id ${id}`);
                } else {
                    resolve(id);
                }
            }, timeout)
        }
        )
    }
    
    
    const dbQuery1 = () => dbCall(1, 5000, false);
    const dbQuery2 = () => dbCall(2, 5000, false);
    const dbQuery3 = () => dbCall(3, 5000, false);
    const dbQuery4 = () => dbCall(4, 5000, true);
    const dbQuery5 = () => dbCall(5, 5000, false);
    
    
    const cappedPool = new Pool(2);
    
    const dbQuery1Res = cappedPool.add(dbQuery1).catch(i => i).then(i => console.log(`Resolved: ${i}`))
    const dbQuery2Res = cappedPool.add(dbQuery2).catch(i => i).then(i => console.log(`Resolved: ${i}`))
    const dbQuery3Res = cappedPool.add(dbQuery3).catch(i => i).then(i => console.log(`Resolved: ${i}`))
    const dbQuery4Res = cappedPool.add(dbQuery4).catch(i => i).then(i => console.log(`Resolved: ${i}`))
    const dbQuery5Res = cappedPool.add(dbQuery5).catch(i => i).then(i => console.log(`Resolved: ${i}`))

    This approach provides a nice API, similar to thread pools in Scala/Java.
    After creating one instance of the pool with const cappedPool = new Pool(2), you hand work to it simply with cappedPool.add(() => myPromise).
    Obviously, we must ensure that the promise does not start immediately, which is why we must "provide it lazily" by wrapping it in a function.

    Most importantly, notice that the result of the method add is a Promise which will be completed/resolved with the value of your original promise! This makes for a very intuitive use.

    const resultPromise = cappedPool.add( () => dbCall(...))
    resultPromise
    .then( actualResult => {
       // Do something with the result from the DB
      }
    )
    
  • 2020-11-29 22:12

    Expanding on the answer posted by @deceleratedcaviar, I created a 'batch' utility function that takes three arguments: an array of values, a concurrency limit, and a processing function. Yes, I realize that using Promise.all this way is closer to batch processing than to true concurrency limiting, because each batch has to finish before the next one starts. But if the goal is simply to avoid an excessive number of HTTP calls at one time, I go with this approach because it is simple and needs no external library.

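    async function batch(o) {
      let resp = []
      // walk the input in chunks of o.limit without mutating the caller's array
      for (let i = 0; i < o.arr.length; i += o.limit) {
        let subset = o.arr.slice(i, i + o.limit)
        // every call in a chunk must settle before the next chunk starts
        let results = await Promise.all(subset.map(val => o.process(val)))
        resp.push(results)
      }
      // flatten the per-chunk results into a single array
      return [].concat.apply([], resp)
    }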
    
    let arr = []
    for (let i = 0; i < 250; i++) { arr.push(i) }
    
    async function calc(val) { return val * 100 }
    
    (async () => {
      let resp = await batch({
        arr: arr,
        limit: 100,
        process: calc
      })
      console.log(resp)
    })();
