DataLayer placement in the .prototxt file generated by Shai's LSTM implementation

Submitted by 爱⌒轻易说出口 on 2019-12-11 11:10:31

Question


This question concerns the answer provided by @Shai in "LSTM module for Caffe", where caffe.NetSpec() is used to explicitly unroll LSTM units in time for training.

Using this implementation, why does the "DummyData" layer, or any data layer used in its place as the input X, appear at the end of the t0 time step, just before "t1/lstm/Mx", in the prototxt file? I don't get it...

As a result, the generated file has to be edited by hand (cut and paste).


Answer 1:


Shai's NetSpec implementation of LSTM unrolls the net in time. Hence, for every time step there is a separate LSTM unit, with weights shared across all time steps.
The "bottom" for each unit in time (e.g. t1/lstm/Mx) is a different time step of the input X, so each unit is fed by its own slice of the input.
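To make the placement less surprising, here is a toy model in plain Python (no Caffe required). It assumes the generator writes layers in a depth-first walk that emits each layer's bottoms before the layer itself, visiting outputs in the order they were defined; that ordering rule is an assumption of this sketch, not something stated above. Only "t1/lstm/Mx" comes from the question; the names x0, x1, h0 and the other t0/... and t1/... layers are hypothetical.

```python
def emit_order(outputs, bottoms):
    """Return layer names in the order a bottoms-first, depth-first walk writes them."""
    emitted = []
    seen = set()

    def visit(layer):
        if layer in seen:
            return
        seen.add(layer)
        # Write every bottom (input) of this layer before the layer itself.
        for bottom in bottoms.get(layer, []):
            visit(bottom)
        emitted.append(layer)

    for out in outputs:
        visit(out)
    return emitted


# Hypothetical two-step unrolled LSTM: each step has its own input slice (x0, x1).
bottoms = {
    "t0/lstm/Mx": ["x0"],
    "t0/lstm/Mh": ["h0"],
    "t0/lstm/h": ["t0/lstm/Mx", "t0/lstm/Mh"],
    "t1/lstm/Mx": ["x1"],
    "t1/lstm/Mh": ["t0/lstm/h"],
    "t1/lstm/h": ["t1/lstm/Mx", "t1/lstm/Mh"],
}

# Outputs are visited in definition order: step t0 first, then step t1.
order = emit_order(["t0/lstm/h", "t1/lstm/h"], bottoms)

for name in order:
    print(name)
```

Under these assumptions, the input slice x1 is written only when its first consumer is reached, so it lands after the whole t0 block and immediately before t1/lstm/Mx. Every layer still appears after all of its bottoms.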

By the way, I suggest using Caffe's draw_net.py utility to draw the resulting prototxt, so you can see the flow of data and the temporal repetitions of the unrolled LSTM unit.
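A minimal sketch of invoking the drawing script from Python. The script location (python/draw_net.py inside a Caffe checkout) and the file names lstm_unrolled.prototxt and lstm_unrolled.png are assumptions for illustration; the helper functions are hypothetical wrappers, not part of Caffe. It also assumes the interpreter running this code is one that can import caffe.

```python
import subprocess
import sys


def draw_net_command(prototxt, image, rankdir="LR", script="python/draw_net.py"):
    """Build the argument list for draw_net.py (script path is an assumption)."""
    # Positional arguments: input prototxt, output image; --rankdir sets layout direction.
    return [sys.executable, script, prototxt, image, "--rankdir", rankdir]


def draw_net(prototxt, image, **kwargs):
    """Run draw_net.py; requires a Caffe checkout plus pydot and Graphviz."""
    subprocess.run(draw_net_command(prototxt, image, **kwargs), check=True)


# Hypothetical file names; "TB" draws the net top to bottom.
cmd = draw_net_command("lstm_unrolled.prototxt", "lstm_unrolled.png", rankdir="TB")
print(" ".join(cmd))
```

Calling draw_net("lstm_unrolled.prototxt", "lstm_unrolled.png") from the root of a Caffe checkout would then write the image. A top-to-bottom layout tends to suit unrolled nets, since the time steps line up side by side.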

Here's what the unrolled net looks like: you can see the components of the three LSTM cells, and the different temporal slices of X going to each temporally unrolled LSTM unit.



Source: https://stackoverflow.com/questions/36748063/datalayer-placement-in-the-prototxt-file-generated-by-shais-lstm-implementatio
