Question
I want to train a bi-directional LSTM in TensorFlow for a sequence classification problem (sentiment classification).
Because the sequences have variable lengths, batches are normally padded with zero vectors. With a uni-directional RNN, I normally use the sequence_length parameter to avoid training on the padding vectors.
How can this be handled with a bi-directional LSTM? Does the sequence_length parameter automatically make the backward direction start from a later position in the sequence, so that it skips the padding?
Thank you
Answer 1:
bidirectional_dynamic_rnn also has a sequence_length parameter that takes care of sequences of variable lengths.
https://www.tensorflow.org/api_docs/python/tf/nn/bidirectional_dynamic_rnn (mirror):
sequence_length: An int32/int64 vector, size [batch_size], containing the actual lengths for each of the sequences.
You can see an example here: https://github.com/Franck-Dernoncourt/NeuroNER/blob/master/src/entity_lstm.py
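To make the effect of sequence_length concrete, here is a minimal plain-Python sketch. It is not TensorFlow code and not the library's implementation; `run_rnn` and the running-sum `step` cell are hypothetical names used only for illustration. It assumes the behaviour documented for TensorFlow's dynamic RNNs: past a sequence's true length, the state is copied through and the output is zeroed.

```python
def run_rnn(batch, sequence_length, step, initial_state=0.0):
    """Toy stand-in for a dynamic RNN over a padded batch (illustration only).

    batch: list of equal-length lists (padded sequences of numbers)
    sequence_length: true (unpadded) length of each sequence
    step: hypothetical cell, (state, x) -> new_state; the output is the new state
    """
    outputs, final_states = [], []
    for seq, length in zip(batch, sequence_length):
        state = initial_state
        row = []
        for t, x in enumerate(seq):
            if t < length:
                state = step(state, x)
                row.append(state)
            else:
                # Past the true length: state is copied through, output is zero.
                row.append(0.0)
        outputs.append(row)
        final_states.append(state)
    return outputs, final_states


# Running-sum "cell". The padding value 9.0 is deliberately non-zero
# to show that padded positions never reach the cell.
batch = [[1.0, 2.0, 3.0, 9.0], [4.0, 5.0, 9.0, 9.0]]
outputs, final_states = run_rnn(batch, [3, 2], step=lambda s, x: s + x)
print(outputs)       # [[1.0, 3.0, 6.0, 0.0], [4.0, 9.0, 0.0, 0.0]]
print(final_states)  # [6.0, 9.0]
```

The final state of each row is the state at its last real element, which is why it is safe to feed the final state to a classifier even when the batch is padded.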
Answer 2:
In the forward pass, the RNN cell stops at sequence_length, which is the unpadded length of the input and is a parameter of tf.nn.bidirectional_dynamic_rnn. In the backward pass, it first uses tf.reverse_sequence to reverse the first sequence_length elements and then traverses them just as in the forward pass.
https://tensorflow.google.cn/api_docs/python/tf/reverse_sequence
This op first slices input along the dimension batch_axis, and for each slice i, reverses the first seq_lengths[i] elements along the dimension seq_axis.
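The reversal described above can be sketched in plain Python. This is an illustration of the semantics only, not TensorFlow's code: `reverse_sequence` here is a list-based imitation of the op for a 2-D batch, and `backward_running_sum` is a hypothetical backward pass that uses a running sum in place of an LSTM cell. The second reversal, which realigns the outputs with the original time steps, is an assumption about how the backward outputs are meant to be consumed.

```python
def reverse_sequence(batch, seq_lengths):
    """Reverse the first seq_lengths[i] items of row i; padding stays in place."""
    return [row[:n][::-1] + row[n:] for row, n in zip(batch, seq_lengths)]


def backward_running_sum(batch, seq_lengths):
    """Hypothetical backward pass: reverse, scan left to right, reverse back."""
    reversed_batch = reverse_sequence(batch, seq_lengths)
    outputs = []
    for row, n in zip(reversed_batch, seq_lengths):
        state, out = 0, []
        for t, x in enumerate(row):
            if t < n:
                state += x
                out.append(state)
            else:
                out.append(0)  # zero output on padding positions
        outputs.append(out)
    # Reverse again so outputs line up with the original time steps.
    return reverse_sequence(outputs, seq_lengths)


batch = [[1, 2, 3, 0, 0], [4, 5, 0, 0, 0]]
lengths = [3, 2]
print(reverse_sequence(batch, lengths))      # [[3, 2, 1, 0, 0], [5, 4, 0, 0, 0]]
print(backward_running_sum(batch, lengths))  # [[6, 5, 3, 0, 0], [9, 5, 0, 0, 0]]
```

Because only the first seq_lengths[i] elements are reversed, the backward direction starts at the last real element of each sequence rather than at the padding, which is what the question asks about.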
Source: https://stackoverflow.com/questions/42936717/bi-directional-lstm-for-variable-length-sequence-in-tensorflow