Google Cloud ML FAILED_PRECONDITION

Question

I am trying to use Google Cloud ML to host a TensorFlow model and get predictions. I have uploaded a pretrained model to the cloud and created a model and a version in my Cloud ML console.

I followed the instructions from here to prepare my data for requesting online predictions. I get the same error with both the Python method and the gcloud method. For simplicity, I'll post the gcloud method:

I run gcloud ml-engine predict --model spell_correction --json-instances test.json where test.json is my input data file (a JSON array named instances). I get the following result:

ERROR: (gcloud.ml-engine.predict) HTTP request failed. Response: {
  "error": {
  "code": 400,
  "message": "Precondition check failed.",
  "status": "FAILED_PRECONDITION"
  }
}

How can I get more details about this? The exact same error happens when I try via Python, where I have a googleapiclient.http.HttpRequest object containing the error. I want to know the specific cause behind this generic message. Does anyone know how to get more details via either the Python method or the gcloud method? Since the error is the same in both cases, I am assuming the root cause is the same too.

Output of gcloud ml-engine models list:

NAME              DEFAULT_VERSION_NAME
spell_correction  testing

Output of gcloud ml-engine versions list --model spell_correction:

NAME     DEPLOYMENT_URI
testing  gs://<my-bucket>/output/1/

test.json: {"instances": [{"tokens": [[9], [4], [11], [9]], "mask": [[18], [7], [12], [30]], "keep_prob": 1.0, "beam": 64}]}

My inputs to the model:

tokens: tf.placeholder(tf.int32, shape=[None, None])

mask: tf.placeholder(tf.int32, shape=[None, None])

keep_prob: tf.placeholder(tf.float32)

beam: tf.placeholder(tf.int32)

When calling via Python, the request_body is just the contents of test.json as a string.

Answer 1:

A side note: did you try "local predict" (https://cloud.google.com/sdk/gcloud/reference/ml-engine/local/predict) with your model first? You might get more information from it.

Answer 2:

After talking to Google Cloud ML support, I got this working.

The main issue I noticed was that all of the data in test.json gets wrapped in a list when it is sent to your model. I solved this by removing the outer list from tokens and mask in my file above. I also changed keep_prob and beam to constants as I do not want them to be able to change for every prediction I make.
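To illustrate the wrapping, here is a minimal sketch in plain Python. It assumes the service stacks each input across instances into one outer list (the batch dimension), and that the corrected per-instance values are flat lists such as [9, 4, 11, 9]. That layout is my reading of "removing the outer list", not something confirmed by the service documentation, and stack_instances and nesting_depth are hypothetical helpers written only for this example.

```python
import json

# Assumed corrected instance: 1-D lists, with keep_prob and beam no longer
# sent because they were turned into constants in the graph.
fixed_instances = [{"tokens": [9, 4, 11, 9], "mask": [18, 7, 12, 30]}]

# The original instance from test.json, already 2-D before any wrapping.
original_instances = [{"tokens": [[9], [4], [11], [9]],
                       "mask": [[18], [7], [12], [30]]}]


def stack_instances(instances):
    """Mimic the assumed batching: gather each input's values into an outer list."""
    return {key: [inst[key] for inst in instances] for key in instances[0]}


def nesting_depth(value):
    """Count list nesting levels, so [[1, 2]] has depth 2."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        value = value[0] if value else None
    return depth


# The placeholders are declared with shape [None, None], i.e. rank 2.
fixed_rank = nesting_depth(stack_instances(fixed_instances)["tokens"])
original_rank = nesting_depth(stack_instances(original_instances)["tokens"])
print("fixed rank:", fixed_rank)
print("original rank:", original_rank)

# Request body for the Python client, built from the corrected instances.
request_body = json.dumps({"instances": fixed_instances})
```

Under these assumptions the corrected instance ends up with rank 2 after wrapping, which matches the [None, None] placeholders, while the original instance ends up with rank 3.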

As a general piece of advice, the error messages provided through the Python call were much more useful to me than the error messages from gcloud ml-engine predict. Also make sure to keep your gcloud install up to date, as they are working on fixes almost constantly.
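For reading the error on the Python side, here is a small sketch that summarizes a JSON error body of the shape shown in the question. describe_error is a hypothetical helper, not part of any Google library. With google-api-python-client I would expect to pass it the content attribute of a caught googleapiclient.errors.HttpError, but treat that wiring as an assumption and check it against your client version.

```python
import json


def describe_error(content):
    """Return a readable summary of a Google API style JSON error body."""
    if isinstance(content, bytes):
        content = content.decode("utf-8", errors="replace")
    try:
        error = json.loads(content).get("error", {})
    except (ValueError, AttributeError):
        # Not JSON, or not a JSON object: return the raw text unchanged.
        return content
    if not isinstance(error, dict):
        return str(error)
    lines = [
        "code: {}".format(error.get("code")),
        "status: {}".format(error.get("status")),
        "message: {}".format(error.get("message")),
    ]
    # Some responses carry an optional "details" list with extra context.
    for detail in error.get("details", []):
        lines.append("detail: {}".format(json.dumps(detail, sort_keys=True)))
    return "\n".join(lines)


# The error body from the question, used here as sample input.
sample = (b'{"error": {"code": 400, "message": "Precondition check failed.", '
          b'"status": "FAILED_PRECONDITION"}}')
summary = describe_error(sample)
print(summary)
```

If the response includes a details list, each entry is printed on its own line. The response in the question has none, so only the code, status, and message appear.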

Source: https://stackoverflow.com/questions/42765968/google-cloud-ml-failed-precondition

Tags

python

tensorflow

gcloud

tensorflow-serving

google-cloud-ml