Question
I implemented a sequence-to-sequence encoder-decoder, but I am having problems varying the target length in prediction. It works when the target has the same length as the training sequences, but not when the length is different. What do I need to change?
from keras.models import Model
from keras.layers import Input, LSTM, Dense
import numpy as np
num_encoder_tokens = 2
num_decoder_tokens = 2
encoder_seq_length = None
decoder_seq_length = None
batch_size = 100
epochs = 2000
hidden_units=10
timesteps=10
input_seqs = np.random.random((1000, 10, num_encoder_tokens))
target_seqs = np.random.random((1000, 10, num_decoder_tokens))
#define training encoder
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = LSTM(hidden_units, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]
#define training decoder
decoder_inputs = Input(shape=(None,num_decoder_tokens))
decoder_lstm = LSTM(hidden_units, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(num_encoder_tokens, activation='tanh')
decoder_outputs = decoder_dense(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
#Run training
model.compile(optimizer='adam', loss='mse')
model.fit([input_seqs, target_seqs], target_seqs,batch_size=batch_size, epochs=epochs)
#new target data
target_seqs = np.random.random((2000, 10, num_decoder_tokens))
# define inference encoder
encoder_model = Model(encoder_inputs, encoder_states)
# define inference decoder
decoder_state_input_h = Input(shape=(hidden_units,))
decoder_state_input_c = Input(shape=(hidden_units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)
# Initialize states
states_values = encoder_model.predict(input_seqs)
Here it wants the same batch size as in input_seqs and does not accept target_seqs having a batch of 2000.
target_seq = np.zeros((1, 1, num_decoder_tokens))
output=list()
for t in range(timesteps):
    output_tokens, h, c = decoder_model.predict([target_seqs] + states_values)
    output.append(output_tokens[0,0,:])
    states_values = [h,c]
    target_seq = output_tokens
What do I need to change so that the model accepts a variable length of input?
Answer 1:
Unfortunately, you cannot do that. You have to set your input to the maximum expected length. Then you can mask the padded steps, either through an Embedding layer or with a masking value, e.g.
keras.layers.Masking(mask_value=0.0)
See more information here.
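For illustration, a minimal sketch of that idea, not taken from the answer: the maximum length, the mask value of 0.0, and the small encoder below are assumed choices. Every sequence is padded to a fixed maximum length, and a Masking layer lets the LSTM skip the padded steps.
import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Masking
max_len = 10                                   # assumed maximum expected length
num_tokens = 2
inputs = Input(shape=(max_len, num_tokens))
masked = Masking(mask_value=0.0)(inputs)       # steps whose features all equal 0.0 are skipped
outputs, state_h, state_c = LSTM(10, return_state=True)(masked)
encoder = Model(inputs, [state_h, state_c])
# a shorter sequence is padded with the mask value up to max_len
seq = np.random.random((1, 4, num_tokens))     # real length 4
padded = np.zeros((1, max_len, num_tokens))    # pad value == mask_value
padded[:, :4, :] = seq
states = encoder.predict(padded)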
Answer 2:
You can create in your data a word/token that means end_of_sequence.
You keep the length to a maximum and probably use some Masking(mask_value) layer to avoid processing undesired steps.
In both the inputs and outputs, you add the end_of_sequence token and complete the missing steps with mask_value.
Example:
- the longest sequence has 4 steps
- make it 5 to add an end_of_sequence token: [step1, step2, step3, step4, end_of_sequence]
- consider a sequence that is shorter: [step1, step2, end_of_sequence, mask_value, mask_value]
Then your shape will be (batch, 5, features); a sketch of this padding appears below.
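A minimal sketch of that padding scheme; the end_of_sequence vector, the mask value, and the pad_with_eos helper are illustrative assumptions, not part of the answer.
import numpy as np
mask_value = 0.0
eos = np.ones(2)                               # hypothetical end_of_sequence vector (features=2)
max_len = 5                                    # longest real sequence (4 steps) + 1 for end_of_sequence
def pad_with_eos(seq):
    # seq: (length, features) -> (max_len, features)
    out = np.full((max_len, seq.shape[1]), mask_value)
    out[:len(seq)] = seq                       # the real steps
    out[len(seq)] = eos                        # end_of_sequence right after them
    return out
short = np.random.random((2, 2))               # a 2-step sequence
padded = pad_with_eos(short)                   # [step1, step2, end_of_sequence, mask, mask]
batch = padded[np.newaxis]                     # shape (1, 5, 2) == (batch, 5, features)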
Another approach is described in your other question, where the user loops each step manually and checks whether the result of this step is the end_of_sequence token: Difference between two Sequence to Sequence Models keras (with and without RepeatVector)
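A rough sketch of that loop, reusing encoder_model, decoder_model and num_decoder_tokens from the question's code; the all-ones end_of_sequence vector, the tolerance, and the step limit are assumptions made for illustration, not the exact code from the linked answer.
eos = np.ones(num_decoder_tokens)              # hypothetical end_of_sequence vector
max_decoder_steps = 50                         # safety limit so the loop always ends
one_input_seq = input_seqs[:1]                 # a single source sequence, shape (1, steps, features)
states = encoder_model.predict(one_input_seq)
current = np.zeros((1, 1, num_decoder_tokens)) # start token (all zeros)
decoded = []
for _ in range(max_decoder_steps):
    out, h, c = decoder_model.predict([current] + states)
    step = out[0, 0, :]
    if np.allclose(step, eos, atol=0.1):       # stop when the step looks like end_of_sequence
        break
    decoded.append(step)
    states = [h, c]
    current = out                              # feed the prediction back as the next input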
If this is an autoencoder, there is also another possibility for variable lengths, where you take the length directly from the input (must feed batches with only one sequence each, no padding/masking): How to apply LSTM-autoencoder to variant-length time-series data?
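A minimal sketch of only the one-sequence-per-batch idea, not the full approach from the linked answer; the feature and unit counts are assumptions. Because the time dimension is left as None and each batch holds a single sequence, no padding is needed and every call can use a different length.
import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM
num_features = 2
hidden = 10
var_inputs = Input(shape=(None, num_features))                 # None = any number of steps
_, var_h, var_c = LSTM(hidden, return_state=True)(var_inputs)
var_encoder = Model(var_inputs, [var_h, var_c])
var_encoder.predict(np.random.random((1, 7, num_features)))    # 7 steps
var_encoder.predict(np.random.random((1, 13, num_features)))   # 13 steps also works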
This is another approach where we store the input length explicitly in a reserved element of the latent vector and later we read this (must also run with only one sequence per batch, no padding): Variable length output in keras
Source: https://stackoverflow.com/questions/51501726/variable-input-for-sequence-to-sequence-autoencoder