Question

Context

I have a colab with a very simple demo Estimator, built to learn and understand the Estimator API. The goal is to settle on a convention for a plug-and-play model with the useful bells and whistles of the trade intact (e.g. early stopping if the validation set stops improving, exporting the model, etc.).

For each of the three Estimator modes (TRAIN, EVAL, and PREDICT), the model_fn returns an EstimatorSpec.

According to the docs:

__new__(
    cls,
    mode,
    predictions=None,          # required by PREDICT
    loss=None,                 # required by TRAIN and EVAL
    train_op=None,             # required by TRAIN
    eval_metric_ops=None,
    export_outputs=None,
    training_chief_hooks=None,
    training_hooks=None,
    scaffold=None,
    evaluation_hooks=None,
    prediction_hooks=None
)

Of these named arguments I would like to bring attention to predictions and export_outputs, which are described in the docs as:

predictions: Predictions Tensor or dict of Tensor.

export_outputs: Describes the output signatures to be exported to SavedModel and used during serving. A dict {name: output} where:
name: An arbitrary name for this output.

output: an ExportOutput object such as ClassificationOutput, RegressionOutput, or PredictOutput. Single-headed models only need to specify one entry in this dictionary. Multi-headed models should specify one entry for each head, one of which must be named using signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY. If no entry is provided, a default PredictOutput mapping to predictions will be created.

Thus it should be clear why I bring up export_outputs: one will most likely want to use the trained model in the future (by loading it from a SavedModel).

To make this question a bit more accessible / add some clarity:

"single-headed" models are the most common model one encounters where the input_fn features are transformed to a singular (batched) output
"multi-headed" models are models where there is more than one output

e.g. the input_fn of this multi-headed model returns (in accordance with the Estimator API) a tuple (features, labels) in which labels is a dict with one entry per head, i.e. this model has two heads:

def input_fn():
  features = ...
  labels1 = ...
  labels2 = ...
  return features, {'head1': labels1, 'head2': labels2}

How one specifies signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY is the core of this question (e.g. should export_outputs be a dict {signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: head}?).

Right, so in the colab you see that our model's export_outputs is actually defined in a multi-head manner (although it shouldn't be):

From estimator functions > model_fn of the colab:

def model_fn(...):

    # ...

    # send the features through the graph
    MODEL = build_fn(MODEL)

    # prediction
    MODEL['predictions'] = {'labels': MODEL['net_logits']} # <--- net_logits added in the build_fn

    MODEL['export_outputs'] = {
        k: tf.estimator.export.PredictOutput(v) for k, v in MODEL['predictions'].items()
    }

    # ...

In this particular instance, if we expand the dictionary comprehension, we have the functional equivalent of:

MODEL['export_outputs'] = {
    'labels': tf.estimator.export.PredictOutput(MODEL['net_logits'])
}

This works here because the dictionary has exactly one key and hence one PredictOutput. However, since the model_fn in the colab has only a single head, export_outputs would be more properly formatted as:

MODEL['export_outputs'] = {
    'predictions': tf.estimator.export.PredictOutput(MODEL['predictions'])
}

as it states in PredictOutput:

__init__(outputs)

where

outputs: A Tensor or a dict of string to Tensor representing the predictions.
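
The defaulting rule quoted from the docs above, and the reason a single-key export_outputs dict works, can be sketched in plain Python. This is only a stand-in that mirrors the rule as I understand it, not the library's actual code: resolve_export_outputs is a hypothetical helper, plain values stand in for PredictOutput objects and Tensors, and the exact behaviour may differ between TensorFlow versions. The one detail taken from TensorFlow itself is the string value of DEFAULT_SERVING_SIGNATURE_DEF_KEY.

```python
# Plain-Python sketch; no TensorFlow required. Plain values stand in for
# PredictOutput objects and Tensors (an assumption made for illustration).

# String value of signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY.
DEFAULT_SERVING_SIGNATURE_DEF_KEY = "serving_default"


def resolve_export_outputs(predictions, export_outputs=None):
    """Hypothetical helper: return the signatures an export would contain."""
    if export_outputs is None:
        # No entry provided: a default output mapping to predictions is created.
        return {DEFAULT_SERVING_SIGNATURE_DEF_KEY: predictions}

    resolved = dict(export_outputs)  # copy so the caller's dict is untouched
    if len(resolved) == 1:
        # Single head: if it has a custom name, it is also made the default.
        ((key, output),) = resolved.items()
        if key != DEFAULT_SERVING_SIGNATURE_DEF_KEY:
            resolved[DEFAULT_SERVING_SIGNATURE_DEF_KEY] = output
    elif len(resolved) > 1 and DEFAULT_SERVING_SIGNATURE_DEF_KEY not in resolved:
        # Several heads and none marked as the default: ambiguous, so reject.
        raise ValueError("none of the export_outputs uses the default serving key")
    return resolved


# The single-key case from the colab: 'labels' also becomes the default.
print(resolve_export_outputs({"labels": "net_logits"}, {"labels": "net_logits"}))
```

Under this reading, a one-entry dict with any key is accepted, whereas a dict with two or more entries has to name one of them with the default key.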

Question

Thus my question is as follows:

1. If PredictOutput can be a dictionary, when / why would one want multiple PredictOutputs as their export_outputs for the EstimatorSpec?
2. If one has a multi-headed model (say with multiple PredictOutputs), how does one actually specify the signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY?
3. What is the point of predictions in the EstimatorSpec when it is also "required" (for anyone who cares about using SavedModels) in export_outputs?

Answer 1:

Thanks for your detailed question; you have clearly dug deep here.

On the first question: there are also the classes RegressionOutput and ClassificationOutput, which cannot be dictionaries. Using an export_outputs dict allows generalizing over those use cases.
On the second question: the head you want to be served by default from the saved model should take the default signature key. For example:

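# Assumes PredictOutput (tf.estimator.export.PredictOutput) and signature_constants
# are imported, and that output_1, output_2, output_3 are the heads' output Tensors.
export_outputs = {
  signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
    PredictOutput(outputs={'some_output_1': output_1}),
  'head-2': PredictOutput(outputs={'some_output_2': output_2}),
  'head-3': PredictOutput(outputs={'some_output_3': output_3})
}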

As for why predictions exists alongside export_outputs, there are two reasons. First, many people use the default export_outputs (which is in turn built from the value of predictions), or don't export to a SavedModel at all. Second, history: predictions came first, and over time more and more features were added. These features required flexibility and extra information, and were therefore packed into the EstimatorSpec independently.
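
The difference between one PredictOutput holding a dict and several PredictOutputs under separate keys can be illustrated with a rough plain-Python sketch. Everything here is a stand-in chosen for illustration: plain dicts and lists replace PredictOutput objects and Tensors, the numbers are made up, and serve is a hypothetical function that only models the idea that a serving request names one signature and receives that signature's outputs.

```python
# Plain-Python sketch; no TensorFlow required. Dicts and lists stand in for
# PredictOutput objects and Tensors, and the values are made up.

# String value of signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY.
DEFAULT_SERVING_SIGNATURE_DEF_KEY = "serving_default"

# Layout A: a single signature whose output is a dict with every head in it.
single_signature = {
    DEFAULT_SERVING_SIGNATURE_DEF_KEY: {
        "some_output_1": [0.1],
        "some_output_2": [0.9],
    },
}

# Layout B: one signature per head, with one head marked as the default.
per_head_signatures = {
    DEFAULT_SERVING_SIGNATURE_DEF_KEY: {"some_output_1": [0.1]},
    "head-2": {"some_output_2": [0.9]},
}


def serve(export_outputs, signature_name=DEFAULT_SERVING_SIGNATURE_DEF_KEY):
    """Hypothetical stand-in for a serving call: one request, one signature."""
    return export_outputs[signature_name]


print(serve(single_signature))               # all outputs come back together
print(serve(per_head_signatures))            # only the default head
print(serve(per_head_signatures, "head-2"))  # a client asks for head-2 by name
```

With layout A every caller receives all outputs at once; with layout B a caller that names no signature gets the default head, and other heads are reached by their key.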

Source: https://stackoverflow.com/questions/53414168/tensorflow-exportoutputs-predictouput-and-specifying-signature-constants-defau

Tags

python