TensorFlow: How and why to use SavedModel

太阳男子 2020-12-22 20:12

I have a few questions regarding the SavedModel API, whose documentation I find leaves a lot of details unexplained.

The first three questions are about

1 answer
  • 2020-12-22 20:40

    EDIT: I wrote this back at TensorFlow 1.4. As of today (TensorFlow 1.12 is stable, there's a 1.13rc and 2.0 is around the corner) the docs linked in the question are much improved.


    I'm trying to use tf.saved_model and also found the Docs quite (too) abstract. Here's my stab at a full answer to your questions:

    1. signature_def_map:

    a. Format: See Tom's answer to "Tensorflow: how to save/restore a model" (Ctrl-F for "tf.saved_model"; currently, the only uses of the phrase on that question are in his answer).

    b. Need: It's my understanding that you do normally need it. If you intend to use the model, you need to know the inputs and outputs of the graph. I think it is akin to a C++ function signature: if you intend to define a function after it's called, or in another C++ file, you need the signature in your main file (i.e. prototyped or in a header file).
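    To make the function-signature analogy concrete, here is a minimal sketch of my own (it is not taken from Tom's answer). It assumes a TensorFlow install that exposes the 1.x API under tf.compat.v1. The helper name export_and_reload, the signature key "predict", and the toy tensors x, w and y are all hypothetical, chosen only for illustration.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1  # assumption: the 1.x graph/session API is available here


def export_and_reload(export_dir):
    """Save a toy graph with a signature, then reload it using only the signature."""
    # --- Export: the signature records which tensors are the inputs and outputs.
    with tf.Graph().as_default():
        x = tf1.placeholder(tf.float32, shape=[None, 2], name="x")
        w = tf1.get_variable("w", initializer=tf.constant([[1.0], [2.0]]))
        y = tf.matmul(x, w, name="y")
        signature = tf1.saved_model.signature_def_utils.predict_signature_def(
            inputs={"x": x}, outputs={"y": y})
        with tf1.Session() as sess:
            sess.run(tf1.global_variables_initializer())
            # export_dir must not exist yet; the builder creates it.
            builder = tf1.saved_model.builder.SavedModelBuilder(export_dir)
            builder.add_meta_graph_and_variables(
                sess,
                [tf1.saved_model.tag_constants.SERVING],
                signature_def_map={"predict": signature})  # hypothetical key
            builder.save()

    # --- Reload: no knowledge of the graph is needed beyond the signature key.
    with tf.Graph().as_default():
        with tf1.Session() as sess:
            meta_graph = tf1.saved_model.loader.load(
                sess, [tf1.saved_model.tag_constants.SERVING], export_dir)
            sig = meta_graph.signature_def["predict"]
            x_name = sig.inputs["x"].name    # tensor name stored in the signature
            y_name = sig.outputs["y"].name
            return sess.run(y_name, feed_dict={x_name: [[1.0, 1.0]]})


if __name__ == "__main__":
    print(export_and_reload(os.path.join(tempfile.mkdtemp(), "model")))
```

    The point of the sketch is the second half: the loading code never hard-codes a tensor name, it looks both names up in the signature, much like a caller relying on a prototype.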

    2. assets_collection:

    Format: I couldn't find clear documentation, so I went to the builder source code. It appears that the argument is an iterable of Tensors of dtype=tf.string, where each Tensor is a path for the asset directory. So, a TensorFlow Graph collection should work. I guess that is where the parameter gets its name, but from the source code I would expect a Python list to work too.

    (You didn't ask if you need to set it, but judging from Zoe's answer to "What are assets in tensorflow?" and iga's answer to the tangentially related "Tensorflow serving: “No assets to save/writes” when exporting models", it doesn't usually need to be set.)

    3. Tags:

    a. Why a list: I don't know why you must pass a list, but you may pass a list with one element. For instance, in my current project I only use the [tf...tag_constants.SERVING] tag.

    b. When to use multiple: Say you're using explicit device placement for operations. Maybe you want to save a CPU version and a GPU version of your graph. Presumably you want to save a serving version of each, and say you also want to save training checkpoints. You could use a CPU/GPU tag and a training/serving tag to manage all cases. The docs hint at it:

    Each MetaGraphDef added to the SavedModel must be annotated with user-specified tags. The tags provide a means to identify the specific MetaGraphDef to load and restore, along with the shared set of variables and assets. These tags typically annotate a MetaGraphDef with its functionality (for example, serving or training), and optionally with hardware-specific aspects (for example, GPU).

    c. Collision: I was too lazy to force a collision myself (I see two cases that would need to be addressed), so I went to the loader source code. Inside def load, you'll see:

    saved_model = _parse_saved_model(export_dir)
    found_match = False
    for meta_graph_def in saved_model.meta_graphs:
      if set(meta_graph_def.meta_info_def.tags) == set(tags):
        meta_graph_def_to_load = meta_graph_def
        found_match = True
        break
    
    if not found_match:
      raise RuntimeError(
          "MetaGraphDef associated with tags " + str(tags).strip("[]") +
          " could not be found in SavedModel. To inspect available tag-sets in"
          " the SavedModel, please use the SavedModel CLI: `saved_model_cli`"
      )
    

    It appears to me that it's looking for an exact match. E.g. say you have a metagraph with tags "GPU" and "Serving" and a metagraph with tag "Serving". If you load "Serving", you'll get the latter metagraph. On the other hand, say you have a metagraph tagged "GPU" and "Serving" and a metagraph tagged "CPU" and "Serving". If you try to load "Serving", you'll get the error. I haven't verified what happens if you save two metagraphs with the exact same tags in the same folder; the builder code doesn't appear to handle such a collision in any special way. Judging from the loop above, which breaks at the first match, at most the first matching metagraph could ever be loaded.
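    The exact-match behaviour can be tried out without TensorFlow at all. The sketch below is a plain-Python imitation of the loop quoted from the loader, not the loader itself; the function find_meta_graph, the metagraph labels, and the tag strings are hypothetical stand-ins.

```python
def find_meta_graph(meta_graphs, tags):
    """Return the label of the first metagraph whose tag set equals `tags`.

    `meta_graphs` is a list of (label, tag_list) pairs standing in for the
    MetaGraphDefs stored in a SavedModel (hypothetical representation).
    """
    for label, graph_tags in meta_graphs:
        # Same comparison as the quoted loader code: whole sets must be equal,
        # so a subset of a metagraph's tags does not match it.
        if set(graph_tags) == set(tags):
            return label
    raise RuntimeError(
        "MetaGraphDef associated with tags " + str(tags).strip("[]") +
        " could not be found")


# Case 1: one metagraph is tagged with exactly {"serve"}, so it is selected.
case_one = [("gpu_serving", ["gpu", "serve"]), ("plain_serving", ["serve"])]
chosen = find_meta_graph(case_one, ["serve"])

# Case 2: every metagraph carries an extra tag, so ["serve"] alone matches none.
case_two = [("gpu_serving", ["gpu", "serve"]), ("cpu_serving", ["cpu", "serve"])]
try:
    find_meta_graph(case_two, ["serve"])
    case_two_failed = False
except RuntimeError:
    case_two_failed = True
```

    Tag order does not matter because both sides are converted to sets, so ["serve", "gpu"] selects the same metagraph as ["gpu", "serve"].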

    4. SavedModel or tf.train.Saver:

    This confused me too. wicke's answer to "Should TensorFlow users prefer SavedModel over Checkpoint or GraphDef?" cleared it up for me. I'll throw in my two cents:

    In the scope of local Python+TensorFlow, you can make tf.train.Saver do everything, but it will cost you. Let me outline the save-a-trained-model-and-deploy use case.

    You'll need your saver object. It's easiest to set it up to save the complete graph (every variable). You probably don't want to save the .meta file every time, since you're working with a static graph, so you'll need to specify that in your training hook. You can read about that on cv-tricks.

    When your training finishes, you'll need to convert your checkpoint file to a pb file. That usually means clearing the current graph, restoring the checkpoint, freezing your variables to constants with tf.python.framework.graph_util, and writing the result with tf.gfile.GFile. You can read about that on medium.

    After that, you want to deploy it in Python. You'll need the input and output Tensor names, i.e. the string names in the graph def. You can read about that on metaflow (actually a very good blog post for the tf.train.Saver method). Some op nodes will let you feed data into them easily; some not so much. I usually gave up on finding an appropriate node and added to the graph def a tf.reshape that didn't actually reshape anything. That was my ad-hoc input node. Same for the output. And then, finally, you can deploy your model, at least locally in Python.

    Or, you could use the answer I linked in point 1 to accomplish all this with the SavedModel API. Fewer headaches, thanks to Tom's answer. You'll get more support and features in the future, if it ever gets documented appropriately. It also looks easier to use with command line serving (the medium link covers doing that with Saver; it looks tough, good luck!). It's practically baked into the new Estimators. And according to the Docs,

    SavedModel is a language-neutral, recoverable, hermetic serialization format.

    Emphasis mine: it looks like you can get your trained models into the growing C++ API much more easily.

    The way I see it, it's like the Datasets API. It's just easier than the old way!

    As for concrete examples of when to use SavedModel or tf.train.Saver: if "basically, when you want to save or restore your model" isn't clear enough for you, the correct time to use SavedModel is any time it makes your life easier. To me, that looks like always, especially if you're using Estimators, deploying in C++, or using command line serving.

    So that's my research on your question. Or four enumerated questions. Err, eight question marks. Hope this helps.
