Question
I am building a basic seq2seq autoencoder, but I'm not sure if I'm doing it correctly.
from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

model = Sequential()
# Encoder
model.add(LSTM(32, activation='relu', input_shape=(timesteps, n_features), return_sequences=True))
model.add(LSTM(16, activation='relu', return_sequences=False))
model.add(RepeatVector(timesteps))
# Decoder
model.add(LSTM(16, activation='relu', return_sequences=True))
model.add(LSTM(32, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
The model is then fit using a batch size parameter:
model.fit(data, data,
          epochs=30,
          batch_size=32)
The model is compiled with the mse loss function and seems to learn.
To get the encoder output for the test data, I am using a K function:
get_encoder_output = K.function([model.layers[0].input],
[model.layers[1].output])
encoder_output = get_encoder_output([test_data])[0]
My first question is whether the model is specified correctly, and in particular whether the RepeatVector layer is needed. I'm not sure what it is doing. What if I omit it and specify the preceding layer with return_sequences=True?
My second question is whether I need to tell get_encoder_output about the batch_size used in training?
Thanks in advance for any help on either question.
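For reference on the first question, the shape effect of RepeatVector can be sketched in plain NumPy. This is only a stand-in: `repeat_vector` is a hypothetical helper, not a Keras function, and the sizes are illustrative. With return_sequences=False the last encoder LSTM emits one vector per sample, shaped (batch, 16); RepeatVector(timesteps) copies that vector along a new time axis so the decoder LSTM receives the 3D input it expects.

```python
import numpy as np

def repeat_vector(x, n):
    """NumPy stand-in for the shape behaviour of Keras RepeatVector(n).

    x has shape (batch, features); the result has shape (batch, n, features),
    with the same vector repeated at every time step.
    """
    return np.repeat(x[:, np.newaxis, :], n, axis=1)

batch, timesteps, latent = 4, 10, 16       # illustrative sizes
encoded = np.random.rand(batch, latent)    # encoder output with return_sequences=False
decoder_input = repeat_vector(encoded, timesteps)
print(decoder_input.shape)
```

If the layer is dropped and the preceding LSTM uses return_sequences=True instead, the decoder would see a different 16-dimensional vector at every time step, so the sequence is no longer squeezed through a single fixed-size code.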
Answer 1:
This might prove useful to you:
As a toy problem I created a seq2seq model for predicting the continuation of different sine waves.
This was the model:
# Keras 2.x imports
from keras.models import Model
from keras.layers import Input, LSTM, Dense, Lambda
from keras import backend as K

def create_seq2seq():
    features_num = 5
    latent_dim = 40
    ##
    # Encoder: four stacked LSTMs; only the last one returns its states
    encoder_inputs = Input(shape=(None, features_num))
    encoded = LSTM(latent_dim, return_state=False, return_sequences=True)(encoder_inputs)
    encoded = LSTM(latent_dim, return_state=False, return_sequences=True)(encoded)
    encoded = LSTM(latent_dim, return_state=False, return_sequences=True)(encoded)
    encoded = LSTM(latent_dim, return_state=True)(encoded)
    encoder = Model(inputs=encoder_inputs, outputs=encoded)
    ##
    encoder_outputs, state_h, state_c = encoder(encoder_inputs)
    encoder_states = [state_h, state_c]
    decoder_inputs = Input(shape=(1, features_num))
    decoder_lstm_1 = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_lstm_2 = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_lstm_3 = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_lstm_4 = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_dense = Dense(features_num)
    all_outputs = []
    inputs = decoder_inputs
    states_1 = encoder_states
    # Placeholder values (overwritten after the first step):
    states_2 = states_1; states_3 = states_1; states_4 = states_1
    ###
    for _ in range(1):
        # Run the decoder on the first timestep; only the first decoder
        # layer is initialised with the encoder states
        outputs_1, state_h_1, state_c_1 = decoder_lstm_1(inputs, initial_state=states_1)
        outputs_2, state_h_2, state_c_2 = decoder_lstm_2(outputs_1)
        outputs_3, state_h_3, state_c_3 = decoder_lstm_3(outputs_2)
        outputs_4, state_h_4, state_c_4 = decoder_lstm_4(outputs_3)
        # Store the current prediction (we will concatenate all predictions later)
        outputs = decoder_dense(outputs_4)
        all_outputs.append(outputs)
        # Reinject the outputs as inputs for the next loop iteration
        # as well as update the states
        inputs = outputs
        states_1 = [state_h_1, state_c_1]
        states_2 = [state_h_2, state_c_2]
        states_3 = [state_h_3, state_c_3]
        states_4 = [state_h_4, state_c_4]
    for _ in range(149):
        # Run the decoder on each remaining timestep
        outputs_1, state_h_1, state_c_1 = decoder_lstm_1(inputs, initial_state=states_1)
        outputs_2, state_h_2, state_c_2 = decoder_lstm_2(outputs_1, initial_state=states_2)
        outputs_3, state_h_3, state_c_3 = decoder_lstm_3(outputs_2, initial_state=states_3)
        outputs_4, state_h_4, state_c_4 = decoder_lstm_4(outputs_3, initial_state=states_4)
        # Store the current prediction (we will concatenate all predictions later)
        outputs = decoder_dense(outputs_4)
        all_outputs.append(outputs)
        # Reinject the outputs as inputs for the next loop iteration
        # as well as update the states
        inputs = outputs
        states_1 = [state_h_1, state_c_1]
        states_2 = [state_h_2, state_c_2]
        states_3 = [state_h_3, state_c_3]
        states_4 = [state_h_4, state_c_4]
    # Concatenate all predictions along the time axis
    decoder_outputs = Lambda(lambda x: K.concatenate(x, axis=1))(all_outputs)
    model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
    #model = load_model('pre_model.h5')
    model.summary()
    return model
Answer 2:
The best way, in my opinion, to implement a seq2seq LSTM in Keras is to use two LSTM models and have the first one transfer its states to the second.
The last LSTM layer in your encoder will need return_state=True, return_sequences=False so that it passes on its h and c states.
You will then need to set up an LSTM decoder that receives these as its initial_state.
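The state handoff described above might look roughly like the following. This is a minimal sketch, not a complete autoencoder: the layer sizes are illustrative, the `tensorflow.keras` import path is an assumption about the installed version, and the all-zero decoder input is only a placeholder chosen to keep the example short.

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

timesteps, n_features, latent_dim = 10, 3, 16  # illustrative values

# Encoder: discard the output sequence, keep the final h and c states.
encoder_inputs = Input(shape=(timesteps, n_features))
_, state_h, state_c = LSTM(latent_dim, return_state=True,
                           return_sequences=False)(encoder_inputs)

# Decoder: a separate LSTM whose initial state is the encoder's final state.
decoder_inputs = Input(shape=(timesteps, n_features))
decoder_seq = LSTM(latent_dim, return_sequences=True)(
    decoder_inputs, initial_state=[state_h, state_c])
decoder_outputs = Dense(n_features)(decoder_seq)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="adam", loss="mse")

# Shape check on random data; the zero decoder input is an assumption.
x = np.random.rand(4, timesteps, n_features).astype("float32")
dec_in = np.zeros_like(x)
pred = model.predict([x, dec_in], verbose=0)
print(pred.shape)
```

Here the encoder's information reaches the decoder only through h and c, which is why no RepeatVector layer appears in this variant.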
For the decoder input you will most likely want a "start of sequence" token as the first time step's input, and afterwards use the decoder output of the nth time step as the input of the decoder in the (n+1)th time step.
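The feedback loop just described can be sketched independently of Keras. In the sketch below, `greedy_decode` and `toy_step` are hypothetical names: `toy_step` is a made-up stand-in for one decoder step (in a real model it would be a call to the decoder LSTM plus its output layer), used only so the loop can run on its own.

```python
import numpy as np

def greedy_decode(step_fn, start_token, state, n_steps):
    """Feed each step's output back in as the next step's input.

    step_fn(inputs, state) must return (output, new_state).
    """
    inputs = start_token
    outputs = []
    for _ in range(n_steps):
        out, state = step_fn(inputs, state)
        outputs.append(out)
        inputs = out  # output of step n becomes input of step n+1
    # Stack along a new time axis: (batch, n_steps, features)
    return np.stack(outputs, axis=1), state

def toy_step(x, state):
    """Hypothetical decoder step: accumulate the input, emit half the state."""
    new_state = state + x
    return new_state * 0.5, new_state

start = np.ones((1, 2))        # stand-in for a "start of sequence" token
init_state = np.zeros((1, 2))  # stand-in for the encoder's final state
decoded, final_state = greedy_decode(toy_step, start, init_state, n_steps=3)
print(decoded.shape)
```

Only the start token and the initial state come from outside the loop; every later input is the model's own previous prediction.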
After you have mastered this, have a look at Teacher Forcing.
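With teacher forcing, the decoder is fed the ground-truth value of the previous time step during training instead of its own prediction. Preparing those inputs usually amounts to shifting the targets right by one step, as in this minimal sketch. The helper name `make_decoder_inputs` and the all-zero start token are assumptions made for illustration.

```python
import numpy as np

def make_decoder_inputs(targets, start_token=None):
    """Shift targets right by one time step for teacher forcing.

    targets has shape (batch, timesteps, features). The first decoder input
    is the start token; input t is the true target at t-1.
    """
    batch, _, n_features = targets.shape
    if start_token is None:
        # Assumption: an all-zero vector serves as "start of sequence".
        start_token = np.zeros((batch, 1, n_features), dtype=targets.dtype)
    return np.concatenate([start_token, targets[:, :-1, :]], axis=1)

targets = np.arange(8, dtype="float32").reshape(2, 4, 1)
decoder_inputs = make_decoder_inputs(targets)
print(decoder_inputs[0, :, 0])  # [0. 0. 1. 2.]
```

For an autoencoder the targets are the input sequences themselves, so training would pair the original sequence as encoder input with its shifted copy as decoder input.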
Source: https://stackoverflow.com/questions/58266407/specifying-a-seq2seq-autoencoder-what-does-repeatvector-do-and-what-is-the-eff