Cannot stack LSTM with MultiRNNCell and dynamic_rnn

Posted by 橙三吉。 on 2019-12-03 07:42:37

This is a very interesting question. Initially, I thought that the two snippets would produce the same result (i.e., a stack of two LSTM cells).

code 1

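import tensorflow as tf  # TensorFlow 1.x (tf.contrib was removed in 2.x)

# [cell] * num_layers repeats the same cell object in every layer
cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers, state_is_tuple=True)
print(cell)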

code 2

# Create a separate LSTMCell object for each layer
cell = []
for i in range(num_layers):
    cell.append(tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True))
cell = tf.contrib.rnn.MultiRNNCell(cell, state_is_tuple=True)
print(cell)

However, if you print the cells in both cases, you get something like the following:

code 1

[<tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>, <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>]

code 2

[<tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>, <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D708B00>]

If you look closely at the results:

  • Code 1 prints a list of two LSTM cell entries that are one and the same object (the memory addresses of the two entries are identical).
  • Code 2 prints a list of two different LSTM cell objects (the memory addresses of the two entries differ).
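The difference comes from how Python list multiplication works, independent of TensorFlow. The following is a minimal sketch using a hypothetical `DummyCell` class (a stand-in for illustration, not a TensorFlow class) to show that `[obj] * n` repeats a reference to one object, while building the list in a loop or comprehension creates a new object each time.

```python
class DummyCell:
    """Hypothetical stand-in for an RNN cell (not a TensorFlow class)."""

    def __init__(self, num_units):
        self.num_units = num_units


num_layers = 2

# List multiplication copies the reference, not the object.
shared = [DummyCell(128)] * num_layers

# A comprehension calls the constructor once per layer.
distinct = [DummyCell(128) for _ in range(num_layers)]

print(shared[0] is shared[1])      # True: one object, listed twice
print(distinct[0] is distinct[1])  # False: two separate objects
```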

Stacking two LSTM cells means feeding the output of the first cell into the second cell. At a high level (the actual TensorFlow operations may differ), it does the following:

  1. First, map the inputs to the hidden units of LSTM cell 1 (in your case, 14 to 128).
  2. Second, map the hidden units of LSTM cell 1 to the hidden units of LSTM cell 2 (in your case, 128 to 128).

Therefore, when you try to perform both of these operations with the same LSTM cell object, an error occurs, because the two layers need weight matrices of different dimensions.

However, if the number of hidden units equals the number of input units (in your case, input 14 and hidden 14), there is no error even though the same LSTM cell is used, because the weight matrices then have the same dimensions.
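The shape argument can be checked with a little arithmetic. This is a rough sketch, assuming a standard LSTM cell without projection whose single kernel multiplies the concatenation of the layer input and the previous hidden state and produces four gates, so its shape is (input_dim + num_units, 4 * num_units). The helper names are made up for illustration and are not part of TensorFlow.

```python
def lstm_kernel_shape(input_dim, num_units):
    # Assumed layout: [input ; previous hidden state] -> 4 gates
    return (input_dim + num_units, 4 * num_units)


def stacked_kernel_shapes(input_dim, num_units, num_layers):
    """Kernel shape each layer of the stack would need."""
    shapes = []
    layer_input = input_dim
    for _ in range(num_layers):
        shapes.append(lstm_kernel_shape(layer_input, num_units))
        # Every later layer receives the previous layer's hidden units.
        layer_input = num_units
    return shapes


def can_share_one_cell(input_dim, num_units, num_layers):
    # One cell object can serve all layers only if every layer
    # needs a kernel of exactly the same shape.
    return len(set(stacked_kernel_shapes(input_dim, num_units, num_layers))) == 1


print(stacked_kernel_shapes(14, 128, 2))  # [(142, 512), (256, 512)]
print(can_share_one_cell(14, 128, 2))     # False
print(can_share_one_cell(14, 14, 2))      # True
```

With input 14 and hidden 128 the two layers need different kernels, so a single shared cell cannot satisfy both. With input 14 and hidden 14 the shapes coincide, which is why that case runs without an error.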

Therefore, I think your second approach is the correct one if you want to stack two LSTM cells.
