Question
I'm unable to find proper documentation to serve the Inception or MobileNet models and write a gRPC client that connects to the server and performs image classification.
So far, I've only managed to configure the TF Serving image on CPU; I haven't been able to run it on my GPU.
When I make a gRPC client request, the request fails with the following error.
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Expects arg[0] to be float but string is provided"
debug_error_string = "{"created":"@1571717090.210000000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"Expects arg[0] to be float but string is provided","grpc_status":3}"
I understand there is some issue with the request format, but I couldn't find proper documentation for the gRPC client that points me in the right direction.
Here's the gRPC client I used for the request.
from __future__ import print_function

import grpc
import tensorflow as tf
import time

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

tf.app.flags.DEFINE_string('server', 'localhost:8505',
                           'PredictionService host:port')
tf.app.flags.DEFINE_string('image', 'E:/Data/Docker/tf_serving/cat.jpg', 'path to image')
FLAGS = tf.app.flags.FLAGS


def main(_):
    channel = grpc.insecure_channel(FLAGS.server)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    # Send request
    with open(FLAGS.image, 'rb') as f:
        # See prediction_service.proto for gRPC request/response details.
        data = f.read()
        request = predict_pb2.PredictRequest()
        request.model_spec.name = 'inception'
        request.model_spec.signature_name = ''
        # Sends the raw image bytes as a string tensor of shape [1].
        request.inputs['image'].CopyFrom(tf.contrib.util.make_tensor_proto(data, shape=[1]))
        result = stub.Predict(request, 5.0)  # 5 secs timeout
        print(result)
        print("Inception Client Passed")


if __name__ == '__main__':
    tf.app.run()
Answer 1:
As I understand it, there are two issues in your question.
A) Running TF Serving on GPU.
B) Making a successful gRPC client request.
Let's take them one by one.
Running TF Serving on GPU
It is a simple two-step process.
Pull the latest image from the official Docker Hub page:
docker pull tensorflow/serving:latest-gpu
Please note the latest-gpu tag in the pull command above; it fetches the image built for GPU.
Run the docker container:
sudo docker run -p 8502:8500 --mount type=bind,source=/my_model_dir,target=/models/inception --name tfserve_gpu -e MODEL_NAME=inception --gpus device=3 -t tensorflow/serving:latest-gpu
Please note that I've passed the argument --gpus device=3 to select the 3rd GPU device. Change it accordingly to select a different GPU device.
Verify that the container has started with the docker ps command.
Also verify that the GPU has been allocated to the TF Serving container with the nvidia-smi command.
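For example, using the container name and GPU index from the run command above:
docker ps --filter name=tfserve_gpu
nvidia-smi -i 3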
(Figure: output of nvidia-smi)
But there seems to be a small problem here: the TF Serving container has consumed all of the GPU's memory.
To restrict GPU memory usage, use the per_process_gpu_memory_fraction flag:
sudo docker run -p 8502:8500 --mount type=bind,source=/my_model_dir,target=/models/inception --name tfserve_gpu -e MODEL_NAME=inception --gpus device=3 -t tensorflow/serving:latest-gpu --per_process_gpu_memory_fraction=0.02
(Figure: output of nvidia-smi after restricting memory usage)
Now we have successfully configured the TF Serving container on a GPU device with reasonable GPU memory usage. Let's move on to the second problem.
Making a gRPC client request
There is an issue in the formatting of your gRPC client request. The served model doesn't take the image in raw binary form; instead, you have to build a float tensor from the image and pass that to the server.
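To double-check what the served signature actually expects, you can inspect the SavedModel with TensorFlow's saved_model_cli tool (assuming the model version directory is /my_model_dir/1; adjust the path to your export):
saved_model_cli show --dir /my_model_dir/1 --tag_set serve --signature_def serving_default
This prints the dtype and shape of each input, which for this model is a float tensor rather than a string, matching the error message above.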
Here's the code for making the gRPC client request.
from __future__ import print_function

import argparse

import grpc
import tensorflow as tf
from tensorflow.contrib.util import make_tensor_proto
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc


def read_tensor_from_image_file(file_name,
                                input_height=299,
                                input_width=299,
                                input_mean=0,
                                input_std=255):
    """Decodes an image file, resizes and normalizes it into a NumPy array."""
    input_name = "file_reader"
    file_reader = tf.io.read_file(file_name, input_name)
    if file_name.endswith(".png"):
        image_reader = tf.image.decode_png(
            file_reader, channels=3, name="png_reader")
    elif file_name.endswith(".gif"):
        image_reader = tf.squeeze(
            tf.image.decode_gif(file_reader, name="gif_reader"))
    elif file_name.endswith(".bmp"):
        image_reader = tf.image.decode_bmp(file_reader, name="bmp_reader")
    else:
        image_reader = tf.image.decode_jpeg(
            file_reader, channels=3, name="jpeg_reader")
    float_caster = tf.cast(image_reader, tf.float32)
    dims_expander = tf.expand_dims(float_caster, 0)
    resized = tf.compat.v1.image.resize_bilinear(dims_expander, [input_height, input_width])
    normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
    # Cap the client-side session's GPU memory so it doesn't fight the server.
    sess = tf.Session(config=tf.ConfigProto(gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.01)))
    result = sess.run(normalized)
    return result


def run(host, port, image, model, signature_name):
    # Preparing the tensor from the image
    tensor = read_tensor_from_image_file(file_name=image, input_height=224,
                                         input_width=224, input_mean=128, input_std=128)

    # Preparing the channel
    channel = grpc.insecure_channel('{host}:{port}'.format(host=host, port=port))
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    # Preparing the gRPC request
    request = predict_pb2.PredictRequest()
    request.model_spec.name = model
    request.model_spec.signature_name = signature_name
    request.inputs['image'].CopyFrom(make_tensor_proto(tensor, shape=[1, 224, 224, 3]))

    # Making the predict request with a 10 second timeout
    result = stub.Predict(request, 10.0)

    # Analysing the result to get the prediction output.
    predictions = result.outputs['prediction'].float_val
    print("Predictions : ", predictions)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--host', help='Tensorflow server host name', default='localhost', type=str)
    parser.add_argument('--port', help='Tensorflow server port number', default=8502, type=int)
    parser.add_argument('--image', help='input image', default='bird.jpg', type=str)
    parser.add_argument('--model', help='model name', default='inception', type=str)
    parser.add_argument('--signature_name', help='Signature name of saved TF model',
                        default='serving_default', type=str)

    args = parser.parse_args()
    run(args.host, args.port, args.image, args.model, args.signature_name)
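For example, assuming the script is saved as inception_client.py (a filename chosen here for illustration):
python inception_client.py --host localhost --port 8502 --image bird.jpg --model inception --signature_name serving_default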
I'm not entirely sure this is the best way to make a TF Serving gRPC client request (since the tensorflow library is required on the client side to prepare the tensor), but it works for me.
Suggestions are welcome.
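If you'd rather not run a TensorFlow session on the client just for preprocessing, here is a minimal sketch that does the same resize-and-normalize step with OpenCV and NumPy instead. It assumes the same 224x224 input size and (x - 128) / 128 normalization as above, and the function name read_tensor_with_opencv is made up for this example. Note that make_tensor_proto accepts the resulting NumPy array directly, so this only avoids the client-side graph and session work, not the tensorflow import itself.

import cv2
import numpy as np

def read_tensor_with_opencv(file_name, input_height=224, input_width=224,
                            input_mean=128, input_std=128):
    # cv2.imread returns BGR; convert to RGB to match tf.image.decode_jpeg.
    image = cv2.cvtColor(cv2.imread(file_name), cv2.COLOR_BGR2RGB)
    # Bilinear resize to the spatial size the model expects.
    resized = cv2.resize(image, (input_width, input_height),
                         interpolation=cv2.INTER_LINEAR)
    # Normalize to float32 and add the batch dimension: shape (1, H, W, 3).
    normalized = (resized.astype(np.float32) - input_mean) / input_std
    return np.expand_dims(normalized, axis=0)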
Source: https://stackoverflow.com/questions/58497010/how-to-setup-tfserving-with-inception-mobilenet-model-for-image-classification