Is there a way to turn on SageMaker model endpoints only when I am receiving inference requests

Submitted by 左心房为你撑大大i on 2020-07-10 10:25:50

Question


I have created a model endpoint which is InService and deployed on an ml.m4.xlarge instance. I am also using API Gateway to create a RESTful API.

Questions:

  1. Is it possible to have my model endpoint InService (or on standby) only when I receive inference requests? Perhaps by writing a Lambda function or something that turns the endpoint off, so that it does not keep accumulating per-hour charges.

  2. If Q1 is possible, would this introduce latency issues for end users? It usually takes a couple of minutes for a model endpoint to be created when I configure it for the first time.

  3. If Q1 is not possible, how would choosing a cheaper instance type affect the time it takes to perform inference? (Say I'm only using the endpoint for an application with a low number of users.)

I am aware of the page that compares different instance types (https://aws.amazon.com/sagemaker/pricing/instance-types/).

But does "moderate" network performance mean that real-time inference may take longer?

Any recommendations are much appreciated. The goal is not to burn money when users are not requesting predictions.
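For reference, the on-demand idea from question 1 can be sketched as a Lambda function that deletes the endpoint when it is not needed and recreates it from its saved endpoint configuration on demand. This is a minimal sketch, not a definitive implementation: the endpoint and config names are hypothetical, and recreating the endpoint still takes several minutes, which is exactly the cold-start latency raised in question 2.

```python
"""Sketch: stop/start a SageMaker real-time endpoint on demand.

Assumptions (hypothetical names): an endpoint "my-endpoint" created
from an endpoint config "my-endpoint-config". Deleting the endpoint
stops per-hour billing; the endpoint config and model are untouched,
so the endpoint can be recreated from them later.
"""

ENDPOINT_NAME = "my-endpoint"            # hypothetical
ENDPOINT_CONFIG = "my-endpoint-config"   # hypothetical


def stop_endpoint(sm_client, endpoint_name=ENDPOINT_NAME):
    """Delete the endpoint to stop accruing instance charges."""
    sm_client.delete_endpoint(EndpointName=endpoint_name)


def start_endpoint(sm_client, endpoint_name=ENDPOINT_NAME,
                   config_name=ENDPOINT_CONFIG):
    """Recreate the endpoint from its saved config.

    Creation takes several minutes, so the first request after a
    cold start will be slow or will fail until the endpoint is
    InService again.
    """
    sm_client.create_endpoint(EndpointName=endpoint_name,
                              EndpointConfigName=config_name)


def lambda_handler(event, context):
    """Example Lambda entry point, e.g. invoked by API Gateway or a
    CloudWatch schedule with {"action": "start"} or {"action": "stop"}."""
    import boto3  # imported here so the sketch loads without boto3 installed
    sm = boto3.client("sagemaker")
    if event.get("action") == "start":
        start_endpoint(sm)
    else:
        stop_endpoint(sm)
    return {"status": "ok"}
```

The trade-off is that every "start" pays the multi-minute endpoint creation time, so this only makes sense if requests arrive in predictable bursts rather than randomly.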


Answer 1:


How large is your model? If it is under AWS Lambda's 50 MB deployment-package size limit and the dependencies are small enough, you could rely directly on Lambda as the execution engine.

If your model is larger than 50 MB, there might still be a way to run it by storing it on EFS. See EFS for Lambda.



Source: https://stackoverflow.com/questions/62765780/is-there-a-way-to-turn-on-sagemaker-model-endpoints-only-when-i-am-receiving-inf
