tensorflow-serving

How to freeze a device-specific saved model?

Submitted by 霸气de小男生 on 2020-12-13 09:41:20

Question: I need to freeze saved models for serving, but some saved models are device-specific. How can I solve this?

with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    sess.run(tf.tables_initializer())
    tf.saved_model.loader.load(sess, [tag_constants.SERVING], saved_model_dir)
    inference_graph_def = tf.get_default_graph().as_graph_def()
    for node in inference_graph_def.node:
        node.device = ''
    frozen_graph_path = os.path.join(frozen_dir, 'frozen_inference_graph.pb')
    output_keys = ['ToInt64
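
A minimal sketch of one way to do this, assuming TF 1.x and treating saved_model_dir, frozen_dir and output_node_names below as placeholders: clear the per-node device assignments recorded at export time, then freeze the graph, so the resulting frozen_inference_graph.pb is no longer pinned to the original device.

import os
import tensorflow as tf
from tensorflow.python.saved_model import tag_constants

saved_model_dir = '/path/to/saved_model'    # hypothetical path
frozen_dir = '/path/to/frozen'              # hypothetical path
output_node_names = ['output_node']         # hypothetical output op names

with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    tf.saved_model.loader.load(sess, [tag_constants.SERVING], saved_model_dir)
    sess.run(tf.tables_initializer())
    graph_def = tf.get_default_graph().as_graph_def()
    # Strip the device placement recorded on every node at export time.
    for node in graph_def.node:
        node.device = ''
    # Fold variable values into constants so the graph can be served standalone.
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess, graph_def, output_node_names)
    with tf.gfile.GFile(os.path.join(frozen_dir, 'frozen_inference_graph.pb'), 'wb') as f:
        f.write(frozen_graph_def.SerializeToString())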

TensorFlow model serving on Google AI Platform online prediction too slow with instance batches

Submitted by 可紊 on 2020-12-12 02:54:46

Question: I'm trying to deploy a TensorFlow model to Google AI Platform for Online Prediction, and I'm having latency and throughput issues. The model runs on my machine in less than 1 second for a single image (with only an Intel Core i7-4790K CPU). I deployed it to AI Platform on a machine with 8 cores and an NVIDIA T4 GPU. Running the model on that configuration also takes a little less than a second when sending only one image. If I start sending many requests, each with
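
A minimal sketch for reproducing the comparison, assuming the google-api-python-client is installed and that the project, model and version names as well as the instance payload below are placeholders: it times a single online-prediction request that carries a batch of instances.

import time
from googleapiclient import discovery

service = discovery.build('ml', 'v1')
name = 'projects/my-project/models/my_model/versions/v1'   # hypothetical resource name

# Hypothetical payload shape; replace with the model's real input signature.
instances = [{'image_bytes': {'b64': '<base64-encoded image>'}}] * 8

start = time.time()
response = service.projects().predict(name=name, body={'instances': instances}).execute()
print('latency for %d instances: %.3f s' % (len(instances), time.time() - start))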

How to properly reduce the size of a tensorflow savedmodel?

Submitted by 坚强是说给别人听的谎言 on 2020-07-09 09:48:54

Question: I have a TensorFlow pre-trained model in checkpoint form, and I intended to deploy it for serving by converting it into SavedModel form. The resulting SavedModel is quite large (the "variables.data-00000-of-0001" file alone is more than a hundred MB). I googled how to reduce the size of the variables, but I could not find a good answer. Could you help me understand how to reduce the size of the variables in a TensorFlow SavedModel? It will be great to show
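
One common contributor to the size is that checkpoints from training carry optimizer slot variables (e.g. Adam moments) that are useless at serving time. A minimal sketch of re-exporting without them, assuming TF 1.x and treating checkpoint_path, export_dir and build_inference_graph() as placeholders for your own paths and model-building code:

import tensorflow as tf

checkpoint_path = '/path/to/model.ckpt'     # hypothetical path
export_dir = '/path/to/slim_saved_model'    # hypothetical path

with tf.Graph().as_default():
    # Placeholder for your own code that builds only the serving graph.
    inputs, outputs = build_inference_graph()

    # Restore just the variables the inference graph uses, dropping optimizer slots.
    saver = tf.train.Saver(tf.trainable_variables())
    with tf.Session() as sess:
        saver.restore(sess, checkpoint_path)
        tf.saved_model.simple_save(sess, export_dir,
                                   inputs={'inputs': inputs},
                                   outputs={'outputs': outputs})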

Segmentation fault (core dumped) - Infering with Tensorflow C++ API from SavedModel

Submitted by 心不动则不痛 on 2020-06-29 04:30:07

Question: I am using the TensorFlow C++ API to load a SavedModel and run inference. The model loads fine, but when I run the inference, I get the following error:

$ ./bazel-bin/tensorflow/gan_loader/gan_loader
2020-06-21 19:29:18.669604: I tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /home/eduardo/Documents/GitHub/edualvarado/tensorflow/tensorflow/gan_loader/generator_model_final
2020-06-21 19:29:18.671368: I tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags {
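
Before digging into the C++ side, it can help to confirm in Python that the SavedModel itself loads and to print its serving signature, since the input/output tensor names and dtypes passed to Session::Run in C++ must match it exactly. A quick hedged check, not the poster's C++ code; the path is a placeholder:

import tensorflow as tf
from tensorflow.python.saved_model import tag_constants

saved_model_dir = 'generator_model_final'   # hypothetical path

with tf.Session(graph=tf.Graph()) as sess:
    # loader.load returns the MetaGraphDef, whose signature_def map holds the
    # exact input/output tensor names and dtypes to use from C++.
    meta_graph = tf.saved_model.loader.load(
        sess, [tag_constants.SERVING], saved_model_dir)
    for name, signature in meta_graph.signature_def.items():
        print(name)
        print(signature)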

Generate instances or inputs for TensorFlow Serving REST API

Submitted by 允我心安 on 2020-06-27 10:11:26

Question: I'm ready to try out my TensorFlow Serving REST API based on a saved model, and I was wondering if there is an easy way to generate the JSON instances (row-based) or inputs (columnar) that I need to send with my request. I have several thousand features in my model and I would hate to type the JSON by hand. Is there a way I can use existing data to come up with serialized data I can throw at the predict API? I'm using TFX for the entire pipeline (incl. tf.Transform), so I'm not sure if there is a
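
A minimal sketch of one way to build the row-based payload from existing data, assuming the features live in a CSV readable by pandas; features.csv, my_model and the serving URL are placeholders:

import json
import pandas as pd

df = pd.read_csv('features.csv')    # hypothetical source of existing examples

# Each row becomes one JSON object keyed by feature name, which is the
# row-based "instances" format TensorFlow Serving's REST predict API accepts.
payload = {'instances': df.head(3).to_dict(orient='records')}

with open('request.json', 'w') as f:
    json.dump(payload, f)

# Then, for example:
#   curl -d @request.json -X POST http://localhost:8501/v1/models/my_model:predict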