I'm following the Serving Inception Model with TensorFlow Serving and Kubernetes workflow and everything works well up to the point of the final serving of the inception mod
The error message seems to indicate that your client cannot connect to the server. Without some additional information it is hard to troubleshoot. If you post your deployment and service configuration and give some information about the environment (is it running on a cloud? which one? what are your security rules? load balancers?), we may be able to help better.
But here are some things that you can check right away:
If you are running in some kind of cloud environment (Amazon, Google, Azure, etc.), they all have security rules that require you to explicitly open the ports on the nodes running your Kubernetes cluster. So every port that your TensorFlow deployment/service uses should be opened on the Controller and Worker nodes.
Did you deploy only a Deployment for the app, or also a Service? If you run a Service, how is it exposed? Did you forget to enable a NodePort?
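The first check above (whether the serving port is reachable from the client machine at all) can be sketched in Python. This is only a minimal sketch: the helper name can_connect is made up for illustration, and it only tests that a TCP connection can be opened. It says nothing about whether the model server behind the port is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call (the address is a placeholder, not taken from the question;
# 9000 is the --port value used in inception_k8s.yaml):
#   can_connect("203.0.113.10", 9000)
```

If this returns False from the client host but True from inside the cluster, a firewall rule or a missing Service exposure is the more likely cause than the model server itself.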
Update: Your service type is LoadBalancer, so a separate load balancer should have been created in GCE. You need to get the IP of the load balancer and access the service through that IP. Please see the section 'Finding Your IP' in this link: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/
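As a rough illustration of "getting the IP of the load balancer", the sketch below pulls the external address out of the JSON that kubectl get service <name> -o json prints. The function name external_address is hypothetical. It assumes the usual Service status layout (status.loadBalancer.ingress), and it returns None while the cloud provider is still provisioning the load balancer.

```python
import json
from typing import Optional


def external_address(service_json: str) -> Optional[str]:
    """Return the load balancer IP (or hostname) from a Service's JSON, if assigned."""
    service = json.loads(service_json)
    # The ingress list is absent or empty until the load balancer is provisioned.
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    for entry in ingress:
        # Some providers report an IP, others only a DNS hostname.
        address = entry.get("ip") or entry.get("hostname")
        if address:
            return address
    return None
```

One way to use it is to save the output of kubectl get service inception-service -o json to a file and pass the file contents to external_address; a None result usually just means the external IP is still pending.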
I figured it out with the help of several TensorFlow experts. Things started to work after I introduced the following changes:
First, I changed the inception_k8s.yaml file in the following way:
Source:
args:
- /serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
--port=9000 --model_name=inception --model_base_path=/serving/inception-export
Modification:
args:
- serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
--port=9000 --model_name=inception --model_base_path=serving/inception-export
Second, I exposed the deployment:
kubectl expose deployments inception-deployment --type="LoadBalancer"
and I used the IP generated from exposing the deployment, not the inception-service IP.
From this point on I am able to run inference from an external host where the client is installed, using the command from the Serving Inception Model with TensorFlow Serving and Kubernetes workflow.