Question
I uploaded my model to Cloud ML Engine, and when trying to make a prediction I receive the following error:
ERROR: (gcloud.ml-engine.predict) HTTP request failed. Response: {
  "error": {
    "code": 429,
    "message": "Prediction server is out of memory, possibly because model size is too big.",
    "status": "RESOURCE_EXHAUSTED"
  }
}
My model size is 151.1 MB. I have already tried all of the suggested actions from the Google Cloud website, such as quantizing the model. Is there a possible solution, or anything else I could do to make it work?
Thanks
Answer 1:
Typically a model of this size should not result in OOM. Since TF does a lot of lazy initialization, some OOMs won't be detected until the first request initializes the data structures. In rare cases a graph can explode 10x in memory, causing OOM.
1) Did you see the prediction error consistently? Due to the way TensorFlow schedules nodes, memory usage for the same graph can differ across runs. Make sure to run prediction multiple times and see if it's a 429 every time (a shell sketch follows after this list).
2) Please make sure 151.1 MB is the size of your SavedModel directory (see the size-check commands below).
3) You can also debug the peak memory locally, for instance by running top while gcloud ml-engine local predict is executing, or by loading the model into memory in a Docker container and using docker stats or some other way to monitor memory usage. You can try TensorFlow Serving for debugging (https://www.tensorflow.org/serving/serving_basic) and post the results (see the local-debugging sketch below).
4) If you find the memory problem is persistent, please contact cloudml-feedback@google.com for further assistance; make sure you include your project number and the associated account for further debugging.
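For 1), a minimal sketch of checking whether the error is consistent, assuming a deployed model and a JSON input file; MODEL_NAME, VERSION_NAME, and instances.json are placeholders for your own values:

    # Send the same request several times; a transient scheduling issue may
    # succeed on some runs, while a genuine OOM should fail every time.
    for i in 1 2 3 4 5; do
      gcloud ml-engine predict \
        --model MODEL_NAME \
        --version VERSION_NAME \
        --json-instances instances.json
    done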
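For 2), you can verify the size and contents of the export on disk; ./saved_model_dir is a placeholder for your export directory:

    # Total on-disk size of the SavedModel directory (should match ~151.1 MB).
    du -sh ./saved_model_dir

    # Confirm the directory is a valid SavedModel and inspect its signatures.
    saved_model_cli show --dir ./saved_model_dir --all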
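For 3), one way to watch peak memory locally; paths and names are again placeholders, and note that TensorFlow Serving expects a numeric version subdirectory (e.g. my_model/1/saved_model.pb) under the mounted path:

    # Run a local prediction and watch the process in another terminal with top.
    gcloud ml-engine local predict \
      --model-dir ./saved_model_dir \
      --json-instances instances.json

    # Or serve the model with the TensorFlow Serving Docker image and
    # monitor the container's memory with docker stats.
    docker run -d --name tf_serving -p 8501:8501 \
      --mount type=bind,source="$(pwd)/my_model",target=/models/my_model \
      -e MODEL_NAME=my_model tensorflow/serving
    docker stats tf_serving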
Source: https://stackoverflow.com/questions/49304175/google-cloud-ml-engine-error-429-out-of-memory