Question
I'm unable to find proper documentation to serve the Inception or MobileNet models and write a gRPC client that connects to the server and performs image classification.
So far, I've only managed to configure the TF Serving image on CPU; I haven't been able to run it on my GPU.
When I make a gRPC client request, the request fails with the following error.
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Expects arg[0] to be float but string is provided"
debug_error_string = "{"created":"@1571717090.210000000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"Expects arg[0] to be float but string is provided","grpc_status":3}"
I understand there is some issue with the request format, but I couldn't find proper documentation for the gRPC client that points me in the right direction.
Here's the gRPC client I used for the request.
from __future__ import print_function

import grpc
import tensorflow as tf
import time

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

tf.app.flags.DEFINE_string('server', 'localhost:8505',
                           'PredictionService host:port')
tf.app.flags.DEFINE_string('image', 'E:/Data/Docker/tf_serving/cat.jpg', 'path to image')
FLAGS = tf.app.flags.FLAGS


def main(_):
    channel = grpc.insecure_channel(FLAGS.server)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    # Send request
    with open(FLAGS.image, 'rb') as f:
        # See prediction_service.proto for gRPC request/response details.
        data = f.read()
        request = predict_pb2.PredictRequest()
        request.model_spec.name = 'inception'
        request.model_spec.signature_name = ''
        # Sends the raw image bytes as a string tensor of shape [1].
        request.inputs['image'].CopyFrom(tf.contrib.util.make_tensor_proto(data, shape=[1]))
        result = stub.Predict(request, 5.0)  # 5 secs timeout
        print(result)
        print("Inception Client Passed")


if __name__ == '__main__':
    tf.app.run()
Answer 1:
As I understand it, there are two issues in your question.
A) Running TF Serving on GPU.
B) Making a successful gRPC client request.
Let's take them one by one.
Running TF Serving on GPU
It is a simple two-step process.
Pull the latest image from the official Docker Hub page:
docker pull tensorflow/serving:latest-gpu
Please note the latest-gpu tag in the pull command above; it fetches the image built for GPU.
Run the docker container:
sudo docker run -p 8502:8500 --mount type=bind,source=/my_model_dir,target=/models/inception --name tfserve_gpu -e MODEL_NAME=inception --gpus device=3 -t tensorflow/serving:latest-gpu
Please note that I've passed the argument --gpus device=3 to select the 3rd GPU device. Change it accordingly to select a different GPU device.
Verify that the container has started with the docker ps command.
Also verify that the GPU has been allocated to the TF Serving container with the nvidia-smi command.
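For example, using the container name and GPU index from the run command above:
docker ps --filter name=tfserve_gpu
nvidia-smi -i 3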
(Figure: output of nvidia-smi)
But there seems to be a small problem here: the TF Serving container has consumed all of the GPU's memory.
To restrict GPU memory usage, use the per_process_gpu_memory_fraction flag:
sudo docker run -p 8502:8500 --mount type=bind,source=/my_model_dir,target=/models/inception --name tfserve_gpu -e MODEL_NAME=inception --gpus device=3 -t tensorflow/serving:latest-gpu --per_process_gpu_memory_fraction=0.02
(Figure: output of nvidia-smi after restricting memory usage)
Now we have successfully configured the TF Serving container on a GPU device with reasonable GPU memory usage. Let's move on to the second problem.
Making a gRPC client request
There is an issue in the formatting of your gRPC client request. The served model doesn't take the image in raw binary form; instead, you have to build a float tensor from the image and pass that to the server.
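To double-check what the served signature actually expects, you can inspect the SavedModel with TensorFlow's saved_model_cli tool (assuming the model version directory is /my_model_dir/1; adjust the path to your export):
saved_model_cli show --dir /my_model_dir/1 --tag_set serve --signature_def serving_default
This prints the dtype and shape of each input, which for this model is a float tensor rather than a string, matching the error message above.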
Here's the code for making the gRPC client request.
from __future__ import print_function

import argparse

import grpc
import tensorflow as tf
from tensorflow.contrib.util import make_tensor_proto
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc


def read_tensor_from_image_file(file_name,
                                input_height=299,
                                input_width=299,
                                input_mean=0,
                                input_std=255):
    """Decodes an image file, resizes and normalizes it into a NumPy array."""
    input_name = "file_reader"
    file_reader = tf.io.read_file(file_name, input_name)
    if file_name.endswith(".png"):
        image_reader = tf.image.decode_png(
            file_reader, channels=3, name="png_reader")
    elif file_name.endswith(".gif"):
        image_reader = tf.squeeze(
            tf.image.decode_gif(file_reader, name="gif_reader"))
    elif file_name.endswith(".bmp"):
        image_reader = tf.image.decode_bmp(file_reader, name="bmp_reader")
    else:
        image_reader = tf.image.decode_jpeg(
            file_reader, channels=3, name="jpeg_reader")
    float_caster = tf.cast(image_reader, tf.float32)
    dims_expander = tf.expand_dims(float_caster, 0)
    resized = tf.compat.v1.image.resize_bilinear(dims_expander, [input_height, input_width])
    normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
    # Cap the client-side session's GPU memory so it doesn't fight the server.
    sess = tf.Session(config=tf.ConfigProto(gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.01)))
    result = sess.run(normalized)
    return result


def run(host, port, image, model, signature_name):
    # Preparing the tensor from the image
    tensor = read_tensor_from_image_file(file_name=image, input_height=224,
                                         input_width=224, input_mean=128, input_std=128)

    # Preparing the channel
    channel = grpc.insecure_channel('{host}:{port}'.format(host=host, port=port))
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    # Preparing the gRPC request
    request = predict_pb2.PredictRequest()
    request.model_spec.name = model
    request.model_spec.signature_name = signature_name
    request.inputs['image'].CopyFrom(make_tensor_proto(tensor, shape=[1, 224, 224, 3]))

    # Making the predict request with a 10 second timeout
    result = stub.Predict(request, 10.0)

    # Analysing the result to get the prediction output.
    predictions = result.outputs['prediction'].float_val
    print("Predictions : ", predictions)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--host', help='Tensorflow server host name', default='localhost', type=str)
    parser.add_argument('--port', help='Tensorflow server port number', default=8502, type=int)
    parser.add_argument('--image', help='input image', default='bird.jpg', type=str)
    parser.add_argument('--model', help='model name', default='inception', type=str)
    parser.add_argument('--signature_name', help='Signature name of saved TF model',
                        default='serving_default', type=str)

    args = parser.parse_args()
    run(args.host, args.port, args.image, args.model, args.signature_name)
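For example, assuming the script is saved as inception_client.py (a filename chosen here for illustration):
python inception_client.py --host localhost --port 8502 --image bird.jpg --model inception --signature_name serving_default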
I'm not entirely sure this is the best way to make a TF Serving gRPC client request (since the tensorflow library is required on the client side to prepare the tensor), but it works for me.
Suggestions are welcome.
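If you'd rather not run a TensorFlow session on the client just for preprocessing, here is a minimal sketch that does the same resize-and-normalize step with OpenCV and NumPy instead. It assumes the same 224x224 input size and (x - 128) / 128 normalization as above, and the function name read_tensor_with_opencv is made up for this example. Note that make_tensor_proto accepts the resulting NumPy array directly, so this only avoids the client-side graph and session work, not the tensorflow import itself.

import cv2
import numpy as np

def read_tensor_with_opencv(file_name, input_height=224, input_width=224,
                            input_mean=128, input_std=128):
    # cv2.imread returns BGR; convert to RGB to match tf.image.decode_jpeg.
    image = cv2.cvtColor(cv2.imread(file_name), cv2.COLOR_BGR2RGB)
    # Bilinear resize to the spatial size the model expects.
    resized = cv2.resize(image, (input_width, input_height),
                         interpolation=cv2.INTER_LINEAR)
    # Normalize to float32 and add the batch dimension: shape (1, H, W, 3).
    normalized = (resized.astype(np.float32) - input_mean) / input_std
    return np.expand_dims(normalized, axis=0)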
Source: https://stackoverflow.com/questions/58497010/how-to-setup-tfserving-with-inception-mobilenet-model-for-image-classification