Question
I want to build an LSTM-based neural network that takes two kinds of inputs and predicts two kinds of outputs. A rough structure can be seen in the following figure. Output 2 depends on output 1, and, as described in the answer to a similar question here, I have tried to implement this by setting the initial state of LSTM 2 from the hidden states of LSTM 1. I have implemented this in TensorFlow with the following code.
import tensorflow as tf
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dense
import numpy as np
np.set_printoptions(suppress=True) # to suppress scientific notation while printing arrays
def reset_graph(seed=2):
    tf.compat.v1.reset_default_graph()
    tf.random.set_seed(seed)  # tf.set_random_seed(seed) in TF 1.x
    np.random.seed(seed)
print(tf.__version__)
seq_len = 10
in_features1 = 3
in_features2 = 5
batch_size = 2
units = 5
# define input data
data1 = np.random.normal(0,1, size=(batch_size, seq_len, in_features1))
print('input 1 shape is', data1.shape)
data2 = np.random.normal(0,1, size=(batch_size, seq_len, in_features2))
print('input 2 shape is', data2.shape)
reset_graph()
# define model
inputs1 = Input(shape=(seq_len, in_features1))
inputs2 = Input(shape=(seq_len, in_features2))
lstm1 = LSTM(units, return_state=True)
lstm1_out, lstm_h, lstm_c = lstm1(inputs1, initial_state=None)
dense1 = Dense(1)
dense1_out = dense1(lstm1_out)
lstm2 = LSTM(units)
lstm2_out = lstm2(inputs2, initial_state=[lstm_h, lstm_c])
dense2 = Dense(1)
dense2_out = dense2(lstm2_out)
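As a quick sanity check (a sketch, not part of the original post; the `Model` wrapping and variable names are mine), the layers above can be wrapped in a `tf.keras.Model` to confirm that the graph builds and that each output produces one scalar per sample:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

seq_len, in_features1, in_features2, units, batch_size = 10, 3, 5, 5, 2

inputs1 = Input(shape=(seq_len, in_features1))
inputs2 = Input(shape=(seq_len, in_features2))
# LSTM 1 returns its final output plus its final hidden and cell states
lstm1_out, lstm_h, lstm_c = LSTM(units, return_state=True)(inputs1)
dense1_out = Dense(1)(lstm1_out)
# LSTM 2 starts from LSTM 1's final states
lstm2_out = LSTM(units)(inputs2, initial_state=[lstm_h, lstm_c])
dense2_out = Dense(1)(lstm2_out)

model = Model([inputs1, inputs2], [dense1_out, dense2_out])
out1, out2 = model.predict([
    np.random.normal(size=(batch_size, seq_len, in_features1)),
    np.random.normal(size=(batch_size, seq_len, in_features2)),
])
print(out1.shape, out2.shape)  # (2, 1) (2, 1)
```

This only confirms the graph is wired and the shapes are consistent; whether the state handoff encodes the output-1 information you care about is the question below.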
The inputs to the two LSTMs are not exactly the same, because some features of input 2 have nothing to do with output 1, but output 2 is definitely influenced by output 1. For example, output 1 is water flow and output 2 is water quality, and water quality is influenced by water flow.
This code runs fine, but I am not sure whether it does what I intend, i.e. whether LSTM 2 is actually influenced by the output of LSTM 1.
Question: is this implementation and reasoning correct, or is there a better way?
Answer 1:
A possible solution is to concatenate the output of LSTM1 to input 2. If LSTM1 returns its full output sequence (return_sequences=True), you can concatenate the LSTM1 output (seq_len, num_units) with input 2 (seq_len, in_features2) along the feature axis, resulting in (seq_len, num_units + in_features2).
Something like this could work:
# define model
inputs1 = Input(shape=(seq_len, in_features1))
inputs2 = Input(shape=(seq_len, in_features2))
# return_sequences=True so LSTM1 emits one vector per timestep,
# which can be concatenated with inputs2 along the feature axis
lstm1 = LSTM(units, return_sequences=True)
lstm1_out = lstm1(inputs1)
lstm2_input = tf.keras.layers.concatenate([inputs2, lstm1_out])
lstm2 = LSTM(units)
lstm2_out = lstm2(lstm2_input)
dense1 = Dense(1)
dense1_out = dense1(lstm1_out)  # per-timestep prediction for output 1
dense2 = Dense(1)
dense2_out = dense2(lstm2_out)
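To illustrate (a sketch I am adding, not from the answer itself; the `Model` wrapping and variable names are mine), the concatenation approach can be assembled end to end and its output shapes checked. Note that with return_sequences=True, Dense(1) on lstm1_out yields one prediction per timestep:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Dense, concatenate
from tensorflow.keras.models import Model

seq_len, in_features1, in_features2, units, batch_size = 10, 3, 5, 5, 2

inputs1 = Input(shape=(seq_len, in_features1))
inputs2 = Input(shape=(seq_len, in_features2))
# full output sequence of LSTM1: (batch, seq_len, units)
lstm1_out = LSTM(units, return_sequences=True)(inputs1)
# feed LSTM2 both input 2 and LSTM1's per-timestep output
lstm2_out = LSTM(units)(concatenate([inputs2, lstm1_out]))
dense1_out = Dense(1)(lstm1_out)  # (batch, seq_len, 1)
dense2_out = Dense(1)(lstm2_out)  # (batch, 1)

model = Model([inputs1, inputs2], [dense1_out, dense2_out])
out1, out2 = model.predict([
    np.random.normal(size=(batch_size, seq_len, in_features1)),
    np.random.normal(size=(batch_size, seq_len, in_features2)),
])
print(out1.shape, out2.shape)  # (2, 10, 1) (2, 1)
```

Compared with the state-passing version, this makes LSTM2 see LSTM1's representation at every timestep rather than only its final state.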
Hope it helps! :)
Source: https://stackoverflow.com/questions/62034413/using-output-from-one-lstm-as-input-into-another-lstm-in-tensorflow