Hot load of models into tensorflow serving container

Submitted by 安稳与你 on 2019-12-04 21:36:47

You can.

First, copy the new model files to the model_base_path you specified when launching TensorFlow Serving, so that the server can see the new model. The directory layout is usually $MODEL_BASE_PATH/$model_a/$version_a/* and $MODEL_BASE_PATH/$model_b/$version_b/*
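The copy step can be scripted. Below is a minimal Python sketch, assuming the model has already been exported to a local directory; the function name `publish_model_version` and its parameters are hypothetical, chosen only to mirror the $MODEL_BASE_PATH/$model/$version layout described above.

```python
import shutil
from pathlib import Path


def publish_model_version(export_dir, model_base_path, model_name, version):
    """Copy an exported model into <model_base_path>/<model_name>/<version>/.

    All four arguments are illustrative; adapt them to your own layout.
    """
    target = Path(model_base_path) / model_name / str(version)
    if target.exists():
        # Refuse to overwrite a version directory that is already in place.
        raise FileExistsError(f"version directory already exists: {target}")
    # Create <model_base_path>/<model_name>/ if this is the model's first version.
    target.parent.mkdir(parents=True, exist_ok=True)
    # copytree creates the version directory itself and copies everything under export_dir.
    shutil.copytree(export_dir, target)
    return target
```

For example, `publish_model_version("/tmp/export", "/models", "model_b", 1)` would place the files under /models/model_b/1/ (both paths are placeholders).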

Then refresh the server with a new model_config_file that includes an entry for the new model. See here on how to add entries to the model config file. There are two ways to make the server pick up the new config:

  1. Save the new config file and restart the server.
  2. Reload the new model config on the fly, without restarting the server. This RPC is defined in model_service.proto as HandleReloadConfigRequest, but the server's REST API does not seem to support it, so you need to use the gRPC API. Sadly, the Python gRPC client seems to be unimplemented. I managed to generate Java client code from the protobuf files, but it is quite complex. An example here explains how to generate Java client code for gRPC inference, and calling handleReloadConfigRequest() is very similar.
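To make the two pieces concrete, here is a hedged Python sketch. `render_model_config` builds the text of a model config file using only the standard library. `reload_config` assumes that the `grpcio`, `protobuf`, and `tensorflow-serving-api` pip packages are installed and that the latter ships the generated `model_service_pb2_grpc` and `model_management_pb2` modules; that is worth verifying for your version, given the note above about the Python client. The function names, the `localhost:8500` address, and the model names and paths are assumptions for illustration.

```python
def render_model_config(models):
    """Render model config file text for a {model_name: base_path} mapping."""
    lines = ["model_config_list {"]
    for name, base_path in models.items():
        lines += [
            "  config {",
            f'    name: "{name}"',
            f'    base_path: "{base_path}"',
            '    model_platform: "tensorflow"',
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


def reload_config(config_text, target="localhost:8500", timeout_s=10.0):
    """Send the config to a running server via HandleReloadConfigRequest.

    Third-party imports are done lazily so that render_model_config
    stays usable without them. The target address is an assumption.
    """
    import grpc
    from google.protobuf import text_format
    from tensorflow_serving.apis import model_management_pb2
    from tensorflow_serving.apis import model_service_pb2_grpc
    from tensorflow_serving.config import model_server_config_pb2

    # Parse the text-format config into a ModelServerConfig message.
    server_config = model_server_config_pb2.ModelServerConfig()
    text_format.Parse(config_text, server_config)

    request = model_management_pb2.ReloadConfigRequest()
    request.config.CopyFrom(server_config)

    with grpc.insecure_channel(target) as channel:
        stub = model_service_pb2_grpc.ModelServiceStub(channel)
        response = stub.HandleReloadConfigRequest(request, timeout=timeout_s)
    # error_code 0 corresponds to OK in the response's status message.
    return response.status.error_code == 0
```

A call such as `reload_config(render_model_config({"model_a": "/models/model_a", "model_b": "/models/model_b"}))` would then ask the server to load both entries. The mapping lists the existing model alongside the new one, since the config sent is the full list the server should serve.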