NodeJs delay each promise within Promise.all()

Submitted by 非 Y 不嫁゛ on 2019-12-11 07:29:26

Question


I'm trying to update a tool that was created a while ago which uses nodejs (I am not a JS developer, so I'm trying to piece the code together) and am getting stuck at the last hurdle.

The new functionality will take in a Swagger .json definition, compare its endpoints against the matching API Gateway on AWS using the 'aws-sdk' SDK for JS, and then update the Gateway accordingly.

The code runs fine on a small definition file (about 15 endpoints) but as soon as I give it a bigger one, I start getting tons of TooManyRequestsException errors.

I understand that this is because my calls to the API Gateway service are being made too quickly and a delay / pause is needed. This is where I am stuck.

I have tried adding:

  • a delay() to each promise being returned
  • running a setTimeout() in each promise
  • adding a delay to the Promise.all and Promise.mapSeries

Currently my code loops through each endpoint within the definition and then adds the response of each promise to a promise array:

promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath)); 

Once the loop is finished I run this:

        return Promise.all(promises)
        .catch((err) => {
            winston.error(err);
        })

I have tried the same with a mapSeries (no luck).

It looks like the functions within the getMethodResponse promise are run immediately, so no matter what type of delay I add they all still just execute. My suspicion is that I need to make getMethodResponse return a function and then use mapSeries, but I can't get this to work either.

Code I tried: Wrapped the getMethodResponse in this:

return function(value){}

Then added this after the loop (and within the loop - no difference):

 Promise.mapSeries(function (promises) {
 return 'a'();
 }).then(function (results) {
 console.log('result', results);
 });

I have also tried many other suggestions (here and here).

Any suggestions please?

EDIT

As requested, here is some additional code to try to pinpoint the issue.

The code currently working with a small set of endpoints (within the Swagger file):

module.exports = (apiName, externalUrl) => {

return getSwaggerFromHttp(externalUrl)
    .then((swagger) => {
        let paths = swagger.paths;
        let resourcePath = '';
        let resourceMethod = '';
        let promises = [];

        _.each(paths, function (value, key) {
            resourcePath = key;
            _.each(value, function (value, key) {
                resourceMethod = key;
                let statusList = [];
                _.each(value.responses, function (value, key) {
                    if (key >= 200 && key <= 204) {
                        statusList.push(key)
                    }
                });
                _.each(statusList, function (value, key) { //Only for 200-201 range  

                    //Working with small set 
                    promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath))
                });             
            });
        });

        //Working with small set
        return Promise.all(promises)
        .catch((err) => {
            winston.error(err);
        })
    })
    .catch((err) => {
        winston.error(err);
    });

};

I have since tried adding this in place of the return Promise.all():

            Promise.map(promises, function() {
            // Promise.map awaits for returned promises as well.
            console.log('X');
        },{concurrency: 5})
        .then(function() {
            return console.log("y");
        });

This outputs something like the following (it's the same for each endpoint, and there are many):

Error: TooManyRequestsException: Too Many Requests
X
Error: TooManyRequestsException: Too Many Requests
X
Error: TooManyRequestsException: Too Many Requests

The AWS SDK is called 3 times within each promise. The calls, all initiated from the getMethodResponse() function, are:

apigateway.getRestApisAsync()
return apigateway.getResourcesAsync(resourceParams)
apigateway.getMethodAsync(params, function (err, data) {}

The AWS SDK documentation states that this is typical behaviour when too many consecutive calls are made too quickly. I've had a similar issue in the past, which was resolved by simply adding a .delay(500) to the code being called.

Something like:

    return apigateway.updateModelAsync(updateModelParams)
    .tap(() => logger.verbose(`Updated model ${updatedModel.name}`))
    .tap(() => bar.tick())
    .delay(500)

EDIT #2

In the name of thoroughness, I thought I would include my entire .js file.

'use strict';

const AWS = require('aws-sdk');
let apigateway, lambda;
const Promise = require('bluebird');
const R = require('ramda');
const logger = require('../logger');
const config = require('../config/default');
const helpers = require('../library/helpers');
const winston = require('winston');
const request = require('request');
const _ = require('lodash');
const region = 'ap-southeast-2';
const methodLib = require('../aws/methods');

const emitter = require('../library/emitter');
emitter.on('updateRegion', (region) => {
    region = region;
    AWS.config.update({ region: region });
    apigateway = new AWS.APIGateway({ apiVersion: '2015-07-09' });
    Promise.promisifyAll(apigateway);
});

function getSwaggerFromHttp(externalUrl) {
    return new Promise((resolve, reject) => {
        request.get({
            url: externalUrl,
            header: {
                "content-type": "application/json"
            }
        }, (err, res, body) => {
            if (err) {
                winston.error(err);
                reject(err);
            }

            let result = JSON.parse(body);
            resolve(result);
        })
    });
}

/*
    Deletes a method response
*/
function deleteMethodResponse(httpMethod, resourceId, restApiId, statusCode, resourcePath) {

    let methodResponseParams = {
        httpMethod: httpMethod,
        resourceId: resourceId,
        restApiId: restApiId,
        statusCode: statusCode
    };

    return apigateway.deleteMethodResponseAsync(methodResponseParams)
        .delay(1200)
        .tap(() => logger.verbose(`Method response ${statusCode} deleted for path: ${resourcePath}`))
        .error((e) => {
            return console.log(`Error deleting Method Response ${httpMethod} not found on resource path: ${resourcePath} (resourceId: ${resourceId})`); // an error occurred
            logger.error('Error: ' + e.stack)
        });
}

/*
    Deletes an integration response
*/
function deleteIntegrationResponse(httpMethod, resourceId, restApiId, statusCode, resourcePath) {

    let methodResponseParams = {
        httpMethod: httpMethod,
        resourceId: resourceId,
        restApiId: restApiId,
        statusCode: statusCode
    };

    return apigateway.deleteIntegrationResponseAsync(methodResponseParams)
        .delay(1200)
        .tap(() => logger.verbose(`Integration response ${statusCode} deleted for path ${resourcePath}`))
        .error((e) => {
            return console.log(`Error deleting Integration Response ${httpMethod} not found on resource path: ${resourcePath} (resourceId: ${resourceId})`); // an error occurred
            logger.error('Error: ' + e.stack)
        });
}

/*
    Get Resource
*/
function getMethodResponse(httpMethod, statusCode, apiName, resourcePath) {

    let params = {
        httpMethod: httpMethod.toUpperCase(),
        resourceId: '',
        restApiId: ''
    }

    return getResourceDetails(apiName, resourcePath)
        .error((e) => {
            logger.unimportant('Error: ' + e.stack)
        }) 
        .then((result) => {
            //Only run the comparrison of models if the resourceId (from the url passed in) is found within the AWS Gateway
            if (result) {
                params.resourceId = result.resourceId
                params.restApiId = result.apiId

                var awsMethodResponses = [];
                try {
                    apigateway.getMethodAsync(params, function (err, data) {
                        if (err) {
                            if (err.statusCode == 404) {
                                return console.log(`Method ${params.httpMethod} not found on resource path: ${resourcePath} (resourceId: ${params.resourceId})`); // an error occurred
                            }
                            console.log(err, err.stack); // an error occurred
                        }
                        else {
                            if (data) {
                                _.each(data.methodResponses, function (value, key) {
                                    if (key >= 200 && key <= 204) {
                                        awsMethodResponses.push(key)
                                    }
                                });
                                awsMethodResponses = _.pull(awsMethodResponses, statusCode); //List of items not found within the Gateway - to be removed.
                                _.each(awsMethodResponses, function (value, key) {
                                    if (data.methodResponses[value].responseModels) {
                                        var existingModel = data.methodResponses[value].responseModels['application/json']; //Check if there is currently a model attached to the resource / method about to be deleted
                                        methodLib.updateResponseAssociation(params.httpMethod, params.resourceId, params.restApiId, statusCode, existingModel); //Associate this model to the same resource / method, under the new response status
                                    }
                                    deleteMethodResponse(params.httpMethod, params.resourceId, params.restApiId, value, resourcePath)
                                        .delay(1200)
                                        .done();
                                    deleteIntegrationResponse(params.httpMethod, params.resourceId, params.restApiId, value, resourcePath)
                                        .delay(1200)
                                        .done();
                                })
                            }
                        }
                    })
                        .catch(err => {
                            console.log(`Error: ${err}`);
                        });
                }
                catch (e) {
                    console.log(`getMethodAsync failed, Error: ${e}`);
                }
            }
        })
};

function getResourceDetails(apiName, resourcePath) {

    let resourceExpr = new RegExp(resourcePath + '$', 'i');

    let result = {
        apiId: '',
        resourceId: '',
        path: ''
    }

    return helpers.apiByName(apiName, AWS.config.region)
        .delay(1200)
        .then(apiId => {
            result.apiId = apiId;

            let resourceParams = {
                restApiId: apiId,
                limit: config.awsGetResourceLimit,
            };

            return apigateway.getResourcesAsync(resourceParams)

        })
        .then(R.prop('items'))
        .filter(R.pipe(R.prop('path'), R.test(resourceExpr)))
        .tap(helpers.handleNotFound('resource'))
        .then(R.head)
        .then([R.prop('path'), R.prop('id')])
        .then(returnedObj => {
            if (returnedObj.id) {
                result.path = returnedObj.path;
                result.resourceId = returnedObj.id;
                logger.unimportant(`ApiId: ${result.apiId} | ResourceId: ${result.resourceId} | Path: ${result.path}`);
                return result;
            }
        })
        .catch(err => {
            console.log(`Error: ${err} on API: ${apiName} Resource: ${resourcePath}`);
        });
};

function delay(t) {
    return new Promise(function(resolve) { 
        setTimeout(resolve, t)
    });
 }

module.exports = (apiName, externalUrl) => {

    return getSwaggerFromHttp(externalUrl)
        .then((swagger) => {
            let paths = swagger.paths;
            let resourcePath = '';
            let resourceMethod = '';
            let promises = [];

            _.each(paths, function (value, key) {
                resourcePath = key;
                _.each(value, function (value, key) {
                    resourceMethod = key;
                    let statusList = [];
                    _.each(value.responses, function (value, key) {
                        if (key >= 200 && key <= 204) {
                            statusList.push(key)
                        }
                    });
                    _.each(statusList, function (value, key) { //Only for 200-201 range  

                        promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath))

                    });             
                });
            });

            //Working with small set
            return Promise.all(promises)
            .catch((err) => {
                winston.error(err);
            })
        })
        .catch((err) => {
            winston.error(err);
        });
};

Answer 1:


You apparently have a misunderstanding about what Promise.all() and Promise.map() do.

All Promise.all() does is keep track of a whole array of promises to tell you when the async operations they represent are all done (or one returns an error). When you pass it an array of promises (as you are doing), ALL those async operations have already been started in parallel. So, if you're trying to limit how many async operations are in flight at the same time, it's already too late at that point. So, Promise.all() by itself won't help you control how many are running at once in any way.

"I've also noticed since, that it seems this line promises.push(getMethodResponse(resourceMethod, value, apiName, resourcePath)) is actually executing promises and not simply adding them to the array. Seems like the last Promise.all() doesn't actually do much."

Yep, when you execute promises.push(getMethodResponse()), you are calling getMethodResponse() immediately right then. That starts the async operation immediately. That function then returns a promise and Promise.all() will monitor that promise (along with all the other ones you put in the array) to tell you when they are all done. That's all Promise.all() does. It monitors operations you've already started. To keep the max number of requests in flight at the same time below some threshold, you have to NOT START the async operations all at once like you are doing. Promise.all() does not do that for you.
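To make the difference between "already started" and "not started yet" concrete, here is a minimal sketch using native promises. The fakeCall() function is a hypothetical stand-in for something like getMethodResponse(); it is not part of the question's code. Instead of pushing the result of calling the function, you push a function that will make the call later, and then invoke those functions one at a time.

```javascript
// Hypothetical stand-in for an async API call; records when each call starts.
const started = [];
function fakeCall(id) {
    started.push(id);
    return new Promise((resolve) => setTimeout(() => resolve(id), 20));
}

// Eager version: calling fakeCall() here would start every operation at once.
// const promises = [1, 2, 3].map((id) => fakeCall(id));

// Deferred version: store functions that start the operation only when invoked.
const tasks = [1, 2, 3].map((id) => () => fakeCall(id));

// Run the tasks one after another; each starts only after the previous one settles.
async function runInSeries(taskList) {
    const results = [];
    for (const task of taskList) {
        results.push(await task());
    }
    return results;
}

const seriesDone = runInSeries(tasks).then((results) => {
    console.log(results);
    return results;
});
```

Right after runInSeries() is called, only the first operation has been started; the others wait their turn. That is the property an array of already-created promises can never give you.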


For Bluebird's Promise.map() to help you at all, you have to pass it an array of DATA, not promises. When you pass it an array of promises that represent async operations that you've already started, it can do no more than Promise.all() can do. But, if you pass it an array of data and a callback function that can then initiate an async operation for each element of data in the array, THEN it can help you when you use the concurrency option.

Your code is pretty complex so I will illustrate with a simple web scraper that wants to read a large list of URLs, but for memory considerations, only process 20 at a time.

const Promise = require('bluebird');   // Promise.map() is a Bluebird method
const rp = require('request-promise');
let urls = [...];    // large array of URLs to process

Promise.map(urls, function(url) {
    return rp(url).then(function(data) {
        // process scraped data here
        return someValue;
    });
}, {concurrency: 20}).then(function(results) {
   // process array of results here
}).catch(function(err) {
    // error here
});

In this example, hopefully you can see that an array of data items is being passed into Promise.map() (not an array of promises). This allows Promise.map() to manage how and when the array is processed and, in this case, it uses the concurrency: 20 setting to make sure that no more than 20 requests are in flight at the same time.
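If it helps to see why the data-plus-callback shape matters, here is a rough sketch of a concurrency-limited map written with native promises. It is only an illustration of the idea and is not how Bluebird implements Promise.map(); mapLimit() and fakeRequest() are names made up for this example.

```javascript
// Minimal concurrency-limited map over plain data, using native promises.
// Illustration only; this is not Bluebird's implementation.
async function mapLimit(items, limit, fn) {
    const results = new Array(items.length);
    let next = 0;

    // Each worker pulls the next unprocessed index until none remain.
    async function worker() {
        while (next < items.length) {
            const index = next++;
            results[index] = await fn(items[index], index);
        }
    }

    const workerCount = Math.min(limit, items.length);
    await Promise.all(Array.from({ length: workerCount }, worker));
    return results;
}

// Demo with a fake async operation that tracks how many run at once.
let inFlight = 0;
let maxInFlight = 0;
function fakeRequest(n) {
    inFlight++;
    maxInFlight = Math.max(maxInFlight, inFlight);
    return new Promise((resolve) => setTimeout(() => {
        inFlight--;
        resolve(n * 2);
    }, 10));
}

const limited = mapLimit([1, 2, 3, 4, 5, 6], 2, fakeRequest).then((results) => {
    console.log(results, 'max in flight:', maxInFlight);
    return results;
});
```

The helper can only hold back work because it is the one calling fn. If it were handed promises instead of data, every operation would already be running before it got a chance to limit anything.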


Your effort to use Promise.map() was passing an array of promises, which does not help you since the promises represent async operations that have already been started:

Promise.map(promises, function() {
    ...
});
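For the loop in the question, one possible way to end up with data instead of promises is to collect a plain descriptor for each call you intend to make. The sketch below assumes the Swagger object has the paths / method / responses shape used in the question; collectTasks() and sampleSwagger are made up for illustration.

```javascript
// Flatten a Swagger "paths" object into plain task descriptors (data, not promises).
function collectTasks(swagger) {
    const tasks = [];
    for (const [resourcePath, methods] of Object.entries(swagger.paths || {})) {
        for (const [resourceMethod, operation] of Object.entries(methods)) {
            for (const statusCode of Object.keys(operation.responses || {})) {
                const code = Number(statusCode);
                // Same 200-204 filter as in the question.
                if (code >= 200 && code <= 204) {
                    tasks.push({ resourcePath, resourceMethod, statusCode });
                }
            }
        }
    }
    return tasks;
}

// Tiny made-up definition for illustration.
const sampleSwagger = {
    paths: {
        '/pets': {
            get: { responses: { '200': {}, '404': {} } },
            post: { responses: { '201': {}, '400': {} } }
        }
    }
};

const tasks = collectTasks(sampleSwagger);
console.log(tasks);

// With Bluebird, the descriptors could then drive the calls, for example:
// Promise.map(tasks, (t) =>
//     getMethodResponse(t.resourceMethod, t.statusCode, apiName, t.resourcePath),
//     { concurrency: 5 });
```

Nothing has been sent to the API at the point where tasks is built, so Promise.map() is free to decide when each getMethodResponse() call actually starts.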

In addition, you really need to figure out exactly what causes the TooManyRequestsException error, either by reading the documentation for the target API or by doing a whole bunch of testing. A variety of things can cause it, and without knowing exactly what you need to control, it just takes a lot of wild guesses to figure out what might work. The most common things that an API might detect are:

  1. Simultaneous requests from the same account or source.
  2. Requests per unit of time from the same account or source (such as request per second).

The concurrency option in Promise.map() will easily help you with the first, but will not necessarily help with the second, because you can limit yourself to a low number of simultaneous requests and still exceed a requests-per-second limit. The second needs some actual time control. Inserting delay() statements will sometimes work, but even that is not a very direct method of managing it and will lead to either inconsistent control (something that works sometimes, but not other times) or sub-optimal control (limiting yourself to something far below what you can actually use).

To stay within a requests-per-second limit, you need some actual time control, either with a rate limiting library or with rate limiting logic in your own code.

Here's an example of a scheme for limiting the number of requests per second you are making: How to Manage Requests to Stay Below Rate Limiting.
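As a rough idea of what "actual time control" can look like, here is a small sketch that spaces out the start of each request by a minimum interval. It is separate from the linked scheme and is not tuned to any real API Gateway limit; createRateLimiter(), fakeRequest() and the 50 ms interval are all assumptions chosen for illustration.

```javascript
// Returns a function that schedules tasks so their starts are at least
// `minIntervalMs` apart (a simple spacing limiter, not a full token bucket).
function createRateLimiter(minIntervalMs) {
    let nextStart = 0; // earliest time the next task may start

    return function schedule(task) {
        const now = Date.now();
        const startAt = Math.max(now, nextStart);
        nextStart = startAt + minIntervalMs;
        return new Promise((resolve) => setTimeout(resolve, startAt - now))
            .then(() => task());
    };
}

// Demo: five fake requests, at most one start every 50 ms (about 20 per second).
const schedule = createRateLimiter(50);
const startTimes = [];
function fakeRequest(id) {
    startTimes.push(Date.now());
    return Promise.resolve(id);
}

const rateLimited = Promise.all(
    [1, 2, 3, 4, 5].map((id) => schedule(() => fakeRequest(id)))
).then((results) => {
    console.log(results, 'elapsed ms:', startTimes[4] - startTimes[0]);
    return results;
});
```

Note that this only limits how often requests start. If individual requests are slow, you may still want to combine it with a concurrency limit so that they do not pile up.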



Source: https://stackoverflow.com/questions/47383610/nodejs-delay-each-promise-within-promise-all
