Masking zero inputs in LSTM in keras without using embedding


Question


I am training an LSTM in Keras:

iclf = Sequential()
iclf.add(Bidirectional(LSTM(units=10, return_sequences=True, recurrent_dropout=0.3), input_shape=(None,2048)))
iclf.add(TimeDistributed(Dense(1, activation='sigmoid')))

The input to each cell is a 2048-dimensional vector that is already known and does not need to be learned (if you will, they are the ELMo embeddings of the words in the input sentences). Therefore, there is no Embedding layer here.

Since the input sequences have variable lengths, they are padded using pad_sequences:

X = pad_sequences(sequences=X, padding='post', truncating='post', value=0.0, dtype='float32')

Now, I want to tell the LSTM to ignore these padded elements. The official way is to use an Embedding layer with mask_zero=True, but here there is no Embedding layer. How can I tell the LSTM to mask the zero elements?


Answer 1:


As @Today suggested in the comments, you can use the Masking layer. Here is a toy example.

# lstm autoencoder that recreates the input sequence
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM, Masking
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.preprocessing.sequence import pad_sequences
from keras.utils import plot_model

# define input sequence
sequence = array([[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9], 
                  [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
                  [0.3, 0.4, 0.5, 0.6]])
# make sure to pass dtype='float32' to pad_sequences; the default is 'int32',
# which would truncate the floating-point values
sequence = pad_sequences(sequence, padding='post', dtype='float32')


# reshape input into [samples, timesteps, features]
n_obs = len(sequence)
n_in = 9
sequence = sequence.reshape((n_obs, n_in, 1))

# define model
model = Sequential()
# mask timesteps whose features all equal mask_value so downstream layers skip them
model.add(Masking(mask_value=0.0, input_shape=(n_in, 1)))
model.add(LSTM(100, activation='relu'))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
plot_model(model, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = model.predict(sequence, verbose=0)
print(yhat[0,:,0])
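
Applied to the model in the question, a minimal sketch (assuming X was padded with 0.0 as described, so an all-zero 2048-dimensional timestep marks padding) would be to put a Masking layer in front of the Bidirectional LSTM; the mask is then propagated to the LSTM and the TimeDistributed Dense layer:

# minimal sketch for the original model; assumes padding value 0.0 as in the question
from keras.models import Sequential
from keras.layers import Bidirectional, LSTM, Masking, Dense, TimeDistributed

iclf = Sequential()
# a timestep is masked only when all 2048 features equal mask_value
iclf.add(Masking(mask_value=0.0, input_shape=(None, 2048)))
iclf.add(Bidirectional(LSTM(units=10, return_sequences=True, recurrent_dropout=0.3)))
iclf.add(TimeDistributed(Dense(1, activation='sigmoid')))

Note that mask_value must match the value used in pad_sequences (0.0 here), and that a timestep is only masked when every feature in it equals that value.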


Source: https://stackoverflow.com/questions/53172852/masking-zero-inputs-in-lstm-in-keras-without-using-embedding
