Question
TLDR: How do I implement this model using tf.lite.experimental.nn.TFLiteLSTMCell and tf.lite.experimental.nn.dynamic_rnn instead of keras.layers.LSTM?
I have this network in keras:
inputs = keras.Input(shape=(1, 52))
state_1_h = keras.Input(shape=(200,))
state_1_c = keras.Input(shape=(200,))
x1, state_1_h_out, state_1_c_out = layers.LSTM(200, return_sequences=True, input_shape=(sequence_length, 52),
                                               return_state=True)(inputs, initial_state=[state_1_h, state_1_c])
output = layers.Dense(13)(x1)
model = keras.Model([inputs, state_1_h, state_1_c],
                    [output, state_1_h_out, state_1_c_out])
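For context (this usage sketch is my own addition, not part of the original post), the model is presumably driven one step at a time, feeding the returned LSTM state back in on the next call:

import numpy as np

# Hypothetical usage sketch: run the stateful model step by step.
h = np.zeros((1, 200), dtype=np.float32)   # initial hidden state
c = np.zeros((1, 200), dtype=np.float32)   # initial cell state
for _ in range(3):
    frame = np.random.rand(1, 1, 52).astype(np.float32)  # one time step with 52 features
    y, h, c = model.predict([frame, h, c])                # y: (1, 1, 13); h, c: (1, 200)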
I need to implement it in tensorflow 1.15, but in a way that will be compatible with tflite 1.15.
**This means that I cannot use keras.layers.LSTM, because it is not compatible with tflite 1.15.**
Following the examples, I found this tutorial: https://github.com/tensorflow/tensorflow/tree/r1.15/tensorflow/lite/experimental/examples/lstm and the accompanying notebook https://github.com/tensorflow/tensorflow/blob/r1.15/tensorflow/lite/experimental/examples/lstm/TensorFlowLite_LSTM_Keras_Tutorial.ipynb
It explains how to implement an LSTM in a way that is compatible with tflite 1.15.
I understand I need to use the following layers: tf.lite.experimental.nn.TFLiteLSTMCell and tf.lite.experimental.nn.dynamic_rnn.
The hard part is this line:
x1, state_1_h_out, state_1_c_out = layers.LSTM(200, return_sequences=True, input_shape=(sequence_length, 52),
                                               return_state=True)(inputs, initial_state=[state_1_h, state_1_c])
I followed the docs to implement it:
The dynamic_rnn documentation explains how to provide an initial state to the dynamic RNN.
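As a minimal sketch of that (my own paraphrase with placeholder shapes, not code from the post), the initial state goes in as an LSTMStateTuple through the initial_state argument:

# Sketch only: placeholder tensors stand in for the real inputs.
seq = tf.placeholder(tf.float32, [1, None, 52])          # time-major input: [time, batch, features]
state_c = tf.placeholder(tf.float32, [None, 200])        # cell state
state_h = tf.placeholder(tf.float32, [None, 200])        # hidden state
cell = tf.lite.experimental.nn.TFLiteLSTMCell(200, state_is_tuple=True)
init = tf.nn.rnn_cell.LSTMStateTuple(state_c, state_h)   # note: the tuple is (c, h)
outputs, final_state = tf.lite.experimental.nn.dynamic_rnn(
    cell, seq, initial_state=init, dtype='float32', time_major=True)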
I tried to use it in the buildLstmLayer function provided in the tutorial (which implements the LSTM):
def buildLstmLayer(inputs, num_layers, num_units):
  """Build the lstm layer.

  Args:
    inputs: The input data.
    num_layers: How many LSTM layers do we want.
    num_units: The number of hidden units in the LSTM cell.
  """
  lstm_cells = []
  for i in range(num_layers):
    lstm_cells.append(
        tf.lite.experimental.nn.TFLiteLSTMCell(
            num_units, forget_bias=0, name='rnn{}'.format(i)))
  lstm_layers = tf.keras.layers.StackedRNNCells(lstm_cells)
  # Assume the input is sized as [batch, time, input_size], then we're going
  # to transpose to be time-majored.
  transposed_inputs = tf.transpose(
      inputs, perm=[1, 0, 2])
  outputs, _ = tf.lite.experimental.nn.dynamic_rnn(
      lstm_layers,
      transposed_inputs,
      dtype='float32',
      time_major=True)
  unstacked_outputs = tf.unstack(outputs, axis=0)
  return unstacked_outputs[-1]
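For reference (reconstructed from the linked notebook as far as I remember it, so the exact sizes below are placeholders rather than a quote), the tutorial plugs this function into a Keras model through a Lambda layer, without any explicit state inputs:

# Rough sketch of how the tutorial wires buildLstmLayer in (sizes are placeholders).
tf.reset_default_graph()
inputs = tf.keras.layers.Input(shape=(28, 28), name='input')
x = tf.keras.layers.Lambda(buildLstmLayer,
                           arguments={'num_layers': 2, 'num_units': 64})(inputs)
output = tf.keras.layers.Dense(10, activation=tf.nn.softmax, name='output')(x)
model = tf.keras.Model(inputs, output)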
This is my code:
import os
os.environ['TF_ENABLE_CONTROL_FLOW_V2'] = '1'

from tensorflow.keras import Model
import tensorflow as tf
print(f"tf version: {tf.__version__}, tf.keras version: {tf.keras.__version__}")
from tensorflow.keras.utils import plot_model


def buildLstmLayer(merged_inputs, num_units):
    inputs = merged_inputs[0]
    state_1_h_keras = merged_inputs[1]
    state_1_c_keras = merged_inputs[2]
    initial_state = tf.nn.rnn_cell.LSTMStateTuple(state_1_h_keras, state_1_c_keras)
    cell = tf.lite.experimental.nn.TFLiteLSTMCell(num_units, state_is_tuple=True)
    outputs, out_states = tf.lite.experimental.nn.dynamic_rnn(
        cell,
        inputs,
        dtype='float32',
        time_major=True,
        initial_state=initial_state)
    state_1_h_out, state_1_c_out = out_states
    return outputs, state_1_h_out, state_1_c_out


tf.reset_default_graph()

inputs = tf.keras.layers.Input(shape=(1, 52), name='input')
batch_size = tf.shape(inputs)[1]
cell = tf.nn.rnn_cell.BasicLSTMCell(200, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
state_1_h, state_1_c = initial_state
state_1_h_keras = tf.keras.Input(tensor=(state_1_h), name='state_1_h')
state_1_c_keras = tf.keras.Input(tensor=(state_1_c), name='state_1_c')

x1, state_1_h_out, state_1_c_out = tf.keras.layers.Lambda(
    buildLstmLayer, arguments={'num_units': 200})([inputs, state_1_h_keras, state_1_c_keras])
output = tf.keras.layers.Dense(13, activation=tf.nn.softmax, name='output')(x1)

model = Model([inputs, state_1_h_keras, state_1_c_keras],
              [output, state_1_h_out, state_1_c_out])

sess = tf.keras.backend.get_session()
inputs_tensors = [sess.graph.get_tensor_by_name(tensor_name) for tensor_name in [x.name for x in model.inputs]]
outputs_tensors = [sess.graph.get_tensor_by_name(tensor_name) for tensor_name in [x.name for x in model.outputs]]

converter = tf.lite.TFLiteConverter.from_session(
    sess, inputs_tensors, outputs_tensors)
tflite_model = converter.convert()
print('Model converted successfully!')
The model seems to be exactly the same model.
However, the converter.convert() line returns:
Specified output array "lambda_1/lambda_1/Identity" is not produced by any op in this graph. Is it a typo? This should not happen
This output array is state_1_h_out. It means that the state returned by the dynamic_rnn layer is not recognized as being produced by any op in the graph.
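One way to look into this (just a small debugging sketch, not something from the original attempt) is to compare the tensor names Keras reports for the model outputs with the lambda ops that actually exist in the session graph:

# Debugging sketch: which names does Keras expect, and which lambda ops exist?
for t in model.outputs:
    print('keras output tensor:', t.name)
lambda_ops = [op.name for op in sess.graph.get_operations() if 'lambda' in op.name]
print('lambda ops in the graph:', lambda_ops[:20])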
By not using the output states:
model = Model([inputs, state_1_h_keras, state_1_c_keras],
[output])
The code works: it converts to tflite, and it even loads on the device!
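As a quick desktop sanity check of the converted flatbuffer (a sketch of my own, separate from the actual on-device loading), the standard tf.lite.Interpreter can run it on all-zero inputs:

import numpy as np

# Sanity-check sketch: run the converted model once with all-zero inputs.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
for detail in interpreter.get_input_details():
    interpreter.set_tensor(detail['index'], np.zeros(detail['shape'], dtype=np.float32))
interpreter.invoke()
output = interpreter.get_tensor(interpreter.get_output_details()[0]['index'])
print('output shape:', output.shape)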
This means that the current problem is with the output states that should be returned by the LSTM. I tried to hack around it with:
all_lambda_outputs = tf.keras.layers.Lambda(buildLstmLayer, arguments={'num_units': 200})([inputs, state_1_h_keras, state_1_c_keras])
x1 = all_lambda_outputs[0]
state_1_h_out, state_1_c_out = tf.keras.layers.Lambda(lambda tup: (tup[0], tup[1]))(all_lambda_outputs[1])
But it still doesn't work.
How can I solve it?
Thank you
Source: https://stackoverflow.com/questions/66063680/keras-lstm-model-a-tf-1-15-equivalent-that-works-with-tflite