What is the right way to use TensorFlow for real-time predictions in a high-traffic application?
Ideally I would have a server/cluster running TensorFlow listening o
This morning, our colleagues released TensorFlow Serving on GitHub, which addresses some of the use cases that you mentioned. It is a distributed wrapper for TensorFlow that is designed to support high-performance serving of multiple models. It supports both bulk processing and interactive requests from app servers.
For more information, see the basic and advanced tutorials.
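To give a rough idea of what an interactive request from an app server can look like, here is a minimal sketch. It assumes the REST predict endpoint that later TensorFlow Serving releases expose (`/v1/models/<name>:predict`, typically on port 8501). The host, port, and the model name `my_model` are placeholders of mine, not something from the release described above, and the shape of `instances` depends entirely on your model's signature.

```python
import json
import urllib.request

# Assumptions (hypothetical values, adjust to your deployment): a TensorFlow
# Serving instance exposes its REST API on localhost:8501 and serves a model
# registered under the name "my_model".
SERVER_URL = "http://localhost:8501"
MODEL_NAME = "my_model"


def build_predict_request(instances, server_url=SERVER_URL, model_name=MODEL_NAME):
    """Build (but do not send) a POST request for the REST predict endpoint."""
    url = f"{server_url}/v1/models/{model_name}:predict"
    # Row format: one entry in "instances" per input example.
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(instances, timeout=1.0):
    """Send one request and return the "predictions" list from the response."""
    request = build_predict_request(instances)
    # A short timeout keeps a slow model server from stalling the app server.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["predictions"]
```

With a model server actually running, an app server would call something like `predict([[1.0, 2.0, 3.0]])` per incoming request. For very high traffic, the gRPC interface is usually the lower-overhead option; the REST call is shown here only because it needs nothing beyond the standard library.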