Question
TLDR: How do I implement this model using tf.lite.experimental.nn.TFLiteLSTMCell and tf.lite.experimental.nn.dynamic_rnn instead of keras.layers.LSTM?
I have this network in keras:
inputs = keras.Input(shape=(1, 52))
state_1_h = keras.Input(shape=(200,))
state_1_c = keras.Input(shape=(200,))
x1, state_1_h_out, state_1_c_out = layers.LSTM(200, return_sequences=True, input_shape=(sequence_length, 52),
                                               return_state=True)(inputs, initial_state=[state_1_h, state_1_c])
output = layers.Dense(13)(x1)
model = keras.Model([inputs, state_1_h, state_1_c],
                    [output, state_1_h_out, state_1_c_out])
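For context (this usage sketch is my own addition, not part of the original post), the model is presumably driven one step at a time, feeding the returned LSTM state back in on the next call:

import numpy as np

# Hypothetical usage sketch: run the stateful model step by step.
h = np.zeros((1, 200), dtype=np.float32)   # initial hidden state
c = np.zeros((1, 200), dtype=np.float32)   # initial cell state
for _ in range(3):
    frame = np.random.rand(1, 1, 52).astype(np.float32)  # one time step with 52 features
    y, h, c = model.predict([frame, h, c])                # y: (1, 1, 13); h, c: (1, 200)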
I need to implement it in tensorflow 1.15, but in a way that will be compatible with tflite 1.15.
**This means that I cannot use keras.layers.LSTM, because it is not compatible with tflite 1.15.**
Following the examples, I found this tutorial: https://github.com/tensorflow/tensorflow/tree/r1.15/tensorflow/lite/experimental/examples/lstm and the accompanying notebook https://github.com/tensorflow/tensorflow/blob/r1.15/tensorflow/lite/experimental/examples/lstm/TensorFlowLite_LSTM_Keras_Tutorial.ipynb
It explains how to implement an LSTM in a way that is compatible with tflite 1.15.
I understand I need to use the following layers: tf.lite.experimental.nn.TFLiteLSTMCell and tf.lite.experimental.nn.dynamic_rnn.
The hard part is this line:
x1, state_1_h_out, state_1_c_out = layers.LSTM(200, return_sequences=True, input_shape=(sequence_length, 52),
                                               return_state=True)(inputs, initial_state=[state_1_h, state_1_c])
I followed the docs to implement it:
The dynamic_rnn documentation explains how to provide an initial state to the dynamic RNN.
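As a minimal sketch of that (my own paraphrase with placeholder shapes, not code from the post), the initial state goes in as an LSTMStateTuple through the initial_state argument:

# Sketch only: placeholder tensors stand in for the real inputs.
seq = tf.placeholder(tf.float32, [1, None, 52])          # time-major input: [time, batch, features]
state_c = tf.placeholder(tf.float32, [None, 200])        # cell state
state_h = tf.placeholder(tf.float32, [None, 200])        # hidden state
cell = tf.lite.experimental.nn.TFLiteLSTMCell(200, state_is_tuple=True)
init = tf.nn.rnn_cell.LSTMStateTuple(state_c, state_h)   # note: the tuple is (c, h)
outputs, final_state = tf.lite.experimental.nn.dynamic_rnn(
    cell, seq, initial_state=init, dtype='float32', time_major=True)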
I tried to use it in the buildLstmLayer function provided in the tutorial (which implements the LSTM):
def buildLstmLayer(inputs, num_layers, num_units):
  """Build the lstm layer.

  Args:
    inputs: The input data.
    num_layers: How many LSTM layers do we want.
    num_units: The number of hidden units in the LSTM cell.
  """
  lstm_cells = []
  for i in range(num_layers):
    lstm_cells.append(
        tf.lite.experimental.nn.TFLiteLSTMCell(
            num_units, forget_bias=0, name='rnn{}'.format(i)))
  lstm_layers = tf.keras.layers.StackedRNNCells(lstm_cells)
  # Assume the input is sized as [batch, time, input_size], then we're going
  # to transpose to be time-majored.
  transposed_inputs = tf.transpose(
      inputs, perm=[1, 0, 2])
  outputs, _ = tf.lite.experimental.nn.dynamic_rnn(
      lstm_layers,
      transposed_inputs,
      dtype='float32',
      time_major=True)
  unstacked_outputs = tf.unstack(outputs, axis=0)
  return unstacked_outputs[-1]
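For reference (reconstructed from the linked notebook as far as I remember it, so the exact sizes below are placeholders rather than a quote), the tutorial plugs this function into a Keras model through a Lambda layer, without any explicit state inputs:

# Rough sketch of how the tutorial wires buildLstmLayer in (sizes are placeholders).
tf.reset_default_graph()
inputs = tf.keras.layers.Input(shape=(28, 28), name='input')
x = tf.keras.layers.Lambda(buildLstmLayer,
                           arguments={'num_layers': 2, 'num_units': 64})(inputs)
output = tf.keras.layers.Dense(10, activation=tf.nn.softmax, name='output')(x)
model = tf.keras.Model(inputs, output)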
This is my code:
import os
os.environ['TF_ENABLE_CONTROL_FLOW_V2'] = '1'

from tensorflow.keras import Model
import tensorflow as tf
print(f"tf version: {tf.__version__}, tf.keras version: {tf.keras.__version__}")
from tensorflow.keras.utils import plot_model


def buildLstmLayer(merged_inputs, num_units):
    inputs = merged_inputs[0]
    state_1_h_keras = merged_inputs[1]
    state_1_c_keras = merged_inputs[2]
    initial_state = tf.nn.rnn_cell.LSTMStateTuple(state_1_h_keras, state_1_c_keras)
    cell = tf.lite.experimental.nn.TFLiteLSTMCell(num_units, state_is_tuple=True)
    outputs, out_states = tf.lite.experimental.nn.dynamic_rnn(
        cell,
        inputs,
        dtype='float32',
        time_major=True,
        initial_state=initial_state)
    state_1_h_out, state_1_c_out = out_states
    return outputs, state_1_h_out, state_1_c_out


tf.reset_default_graph()

inputs = tf.keras.layers.Input(shape=(1, 52), name='input')
batch_size = tf.shape(inputs)[1]
cell = tf.nn.rnn_cell.BasicLSTMCell(200, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
state_1_h, state_1_c = initial_state
state_1_h_keras = tf.keras.Input(tensor=(state_1_h), name='state_1_h')
state_1_c_keras = tf.keras.Input(tensor=(state_1_c), name='state_1_c')

x1, state_1_h_out, state_1_c_out = tf.keras.layers.Lambda(
    buildLstmLayer, arguments={'num_units': 200})([inputs, state_1_h_keras, state_1_c_keras])
output = tf.keras.layers.Dense(13, activation=tf.nn.softmax, name='output')(x1)

model = Model([inputs, state_1_h_keras, state_1_c_keras],
              [output, state_1_h_out, state_1_c_out])

sess = tf.keras.backend.get_session()
inputs_tensors = [sess.graph.get_tensor_by_name(tensor_name) for tensor_name in [x.name for x in model.inputs]]
outputs_tensors = [sess.graph.get_tensor_by_name(tensor_name) for tensor_name in [x.name for x in model.outputs]]

converter = tf.lite.TFLiteConverter.from_session(
    sess, inputs_tensors, outputs_tensors)
tflite_model = converter.convert()
print('Model converted successfully!')
The model seems to be exactly the same model.
However, the converter.convert() line returns:
Specified output array "lambda_1/lambda_1/Identity" is not produced by any op in this graph. Is it a typo? This should not happen
This output array is state_1_h_out. It means that the state returned by the dynamic_rnn layer is not recognized as being produced by any op in the graph.
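One way to look into this (just a small debugging sketch, not something from the original attempt) is to compare the tensor names Keras reports for the model outputs with the lambda ops that actually exist in the session graph:

# Debugging sketch: which names does Keras expect, and which lambda ops exist?
for t in model.outputs:
    print('keras output tensor:', t.name)
lambda_ops = [op.name for op in sess.graph.get_operations() if 'lambda' in op.name]
print('lambda ops in the graph:', lambda_ops[:20])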
By not using the output states:
model = Model([inputs, state_1_h_keras, state_1_c_keras],
[output])
The code works: it converts to tflite, and it even loads on the device!
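As a quick desktop sanity check of the converted flatbuffer (a sketch of my own, separate from the actual on-device loading), the standard tf.lite.Interpreter can run it on all-zero inputs:

import numpy as np

# Sanity-check sketch: run the converted model once with all-zero inputs.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
for detail in interpreter.get_input_details():
    interpreter.set_tensor(detail['index'], np.zeros(detail['shape'], dtype=np.float32))
interpreter.invoke()
output = interpreter.get_tensor(interpreter.get_output_details()[0]['index'])
print('output shape:', output.shape)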
This means that the current problem is with the output states that should be returned by the LSTM. I tried to hack around it with:
all_lambda_outputs = tf.keras.layers.Lambda(buildLstmLayer, arguments={'num_units': 200})([inputs, state_1_h_keras, state_1_c_keras])
x1 = all_lambda_outputs[0]
state_1_h_out, state_1_c_out = tf.keras.layers.Lambda(lambda tup: (tup[0], tup[1]))(all_lambda_outputs[1])
But it still doesn't work.
How can I solve it?
Thank you
Source: https://stackoverflow.com/questions/66063680/keras-lstm-model-a-tf-1-15-equivalent-that-works-with-tflite