Question:
I am training an LSTM in Keras:
iclf = Sequential()
iclf.add(Bidirectional(LSTM(units=10, return_sequences=True, recurrent_dropout=0.3), input_shape=(None,2048)))
iclf.add(TimeDistributed(Dense(1, activation='sigmoid')))
The input to each cell is a 2048-dimensional vector that is already known and does not need to be learned (if you will, these are the ELMo embeddings of the words in the input sentences). Therefore, there is no Embedding layer here.
Since the input sequences have variable lengths, they are padded using pad_sequences:
X = pad_sequences(sequences=X, padding='post', truncating='post', value=0.0, dtype='float32')
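To make the effect of the padding concrete, here is a minimal sketch (with made-up sequence lengths and a small feature dimension standing in for 2048) of what pad_sequences does to variable-length sequences of vectors:

import numpy as np
from keras.preprocessing.sequence import pad_sequences

# two sentences of different lengths; each timestep is a 3-dim "embedding"
# (3 stands in for the 2048 ELMo dimensions)
X_toy = [np.ones((4, 3), dtype='float32'),   # 4 timesteps
         np.ones((2, 3), dtype='float32')]   # 2 timesteps

X_padded = pad_sequences(X_toy, padding='post', truncating='post',
                         value=0.0, dtype='float32')
print(X_padded.shape)  # (2, 4, 3): the shorter sequence is padded with all-zero rows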
Now, I want to tell the LSTM to ignore these padded elements. The official way is to use an Embedding layer with mask_zero=True, but here there is no Embedding layer. How can I inform the LSTM to mask the zero elements?
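(For reference, the Embedding-based masking mentioned above looks roughly like this; the vocabulary size and embedding dimension are placeholders, and it is not applicable here because the inputs are already precomputed vectors:)

from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM

model = Sequential()
# index 0 is reserved for padding and is masked out by mask_zero=True
model.add(Embedding(input_dim=10000, output_dim=128, mask_zero=True))
model.add(Bidirectional(LSTM(10, return_sequences=True)))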
Answer 1:
As @Today suggested in the comments, you can use the Masking layer. Here is a toy example.
# LSTM autoencoder that reconstructs its (padded) input sequences
from numpy import array
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import LSTM, Masking
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model
# define input sequence
sequence = array([[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
[0.3, 0.4, 0.5, 0.6]])
# make sure to use dtype='float32' when padding, otherwise the floating-point
# values are truncated (the default dtype of pad_sequences is int32)
sequence = pad_sequences(sequence, padding='post', dtype='float32')
# reshape input into [samples, timesteps, features]
n_obs = len(sequence)
n_in = 9
sequence = sequence.reshape((n_obs, n_in, 1))
# define model
model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(n_in, 1)))  # padded (all-zero) timesteps are skipped downstream
model.add(LSTM(100, activation='relu'))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
plot_model(model, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = model.predict(sequence, verbose=0)
print(yhat[0,:,0])
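Applied back to the model in the question, a minimal sketch would be to put a Masking layer in front of the Bidirectional LSTM (assuming the padded rows are all-zero 2048-dim vectors and that a genuine embedding is never exactly all zeros):

from keras.models import Sequential
from keras.layers import Masking, Bidirectional, LSTM, TimeDistributed, Dense

iclf = Sequential()
# timesteps whose 2048 features all equal mask_value are ignored by the layers below
iclf.add(Masking(mask_value=0.0, input_shape=(None, 2048)))
iclf.add(Bidirectional(LSTM(units=10, return_sequences=True, recurrent_dropout=0.3)))
iclf.add(TimeDistributed(Dense(1, activation='sigmoid')))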
Source: https://stackoverflow.com/questions/53172852/masking-zero-inputs-in-lstm-in-keras-without-using-embedding