How to apply Monte Carlo Dropout, in tensorflow, for an LSTM if batch normalization is part of the model?

问题

I have a model composed of 3 LSTM layers followed by a batch norm layer and finally dense layer. Here is the code:

def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)

    model = tf.keras.Model(inputs, outputs, name=RNN_type + "_model")
    return model

Now I am aware that to apply MCDropout, we can apply the following code:

y_predict = np.stack([my_model(X_test, training=True) for x in range(100)])
y_proba = y_predict.mean(axis=0)

However, setting training = True will force the batch norm layer to overfit the testing dataset.

Additionally, building a custom Dropout layer while setting training to True isn't a solution in my case because I am using LSTM.

class MCDropout(tf.keras.layers.Dropout):
    def call(self, inputs):
        return super().call(inputs, training=True)

Any help is much appreciated!!

回答1:

A possible solution could be to create a custom LSTM layer. You should override the call method to force the training flag to be True

class MCLSTM(keras.layers.LSTM):
    def __init__(self, units, **kwargs):
        super(MCLSTM, self).__init__(units, **kwargs)
    def call(self, inputs, mask=None, training=None, initial_state=None):
        return super(MCLSTM, self).call(
            inputs,
            mask=mask,
            training=True,
            initial_state=initial_state,
        )

Then you can use it in your code

def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    x = MCLSTM(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)

    model = tf.keras.Model(inputs, outputs, name=RNN_type + "_model")
    return model

or add it to your return_RNN factory (a more elegant way)

===== EDIT =====

Another solution could be to add the training flag when creating the model. Something like this:

def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    # This the Monte Carlo LSTM
    x = LSTM(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs, training=True)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)

    model = tf.keras.Model(inputs, outputs, name=RNN_type + "_model")
    return model

来源：https://stackoverflow.com/questions/62031302/how-to-apply-monte-carlo-dropout-in-tensorflow-for-an-lstm-if-batch-normalizat

标签

tensorflow

lstm

dropout