Question
In this blog post on Recurrent Neural Networks, Denny Britz states: "The above diagram has outputs at each time step, but depending on the task this may not be necessary. For example, when predicting the sentiment of a sentence we may only care about the final output, not the sentiment after each word. Similarly, we may not need inputs at each time step."
In the case where we take the output only at the final time step: how does backpropagation change when there are no outputs at the intermediate steps? It seems we need to define a loss at each time step, but how can we do that without outputs?
Answer 1:
It is not true that you "need to define an output at each time step"; backpropagation through time is actually simpler with a single output than with the per-step outputs shown in the image. When there is just one output, simply "rotate your network 90 degrees" and it becomes a regular feed-forward network (with some signals entering the hidden layers directly), so backpropagation works as usual, pushing the partial derivatives back through the system. When there is an output at every step, things get trickier: you typically define the true loss as the sum of all the per-step losses, and consequently you have to sum all the corresponding gradients.
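As a concrete illustration (not part of the original answer), here is a minimal NumPy sketch of BPTT when the loss is defined only on the final output. The network sizes, the tanh recurrence, and the squared-error loss are all illustrative assumptions; the point is that the gradient enters once, at the last step, and is then pushed backward through time like in a deep feed-forward net.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_in, n_hid = 5, 3, 4                      # sequence length, input size, hidden size

W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden -> hidden
W_hy = rng.normal(scale=0.1, size=(1, n_hid))      # hidden -> output (used once, at the last step)

x = rng.normal(size=(T, n_in))                # one input per time step
y_true = np.array([1.0])                      # a single target for the whole sequence

# Forward pass: unroll the recurrence; no output at intermediate steps.
hs = [np.zeros(n_hid)]
for t in range(T):
    hs.append(np.tanh(W_xh @ x[t] + W_hh @ hs[-1]))
y = W_hy @ hs[-1]                             # the only output
loss = 0.5 * np.sum((y - y_true) ** 2)

# Backward pass: the loss gradient enters at the final step only.
dW_xh, dW_hh = np.zeros_like(W_xh), np.zeros_like(W_hh)
dW_hy = np.outer(y - y_true, hs[-1])
dh = W_hy.T @ (y - y_true)                    # gradient w.r.t. the final hidden state
for t in reversed(range(T)):
    dz = (1 - hs[t + 1] ** 2) * dh            # backprop through tanh
    dW_xh += np.outer(dz, x[t])
    dW_hh += np.outer(dz, hs[t])
    dh = W_hh.T @ dz                          # pass the gradient to the previous step
```

Note that even here the weight gradients are accumulated across time steps, because the same W_xh and W_hh are reused at every step. In the per-step-output case you would additionally inject a loss gradient into dh at every t, which is exactly the "sum of all the small losses" described above.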
Source: https://stackoverflow.com/questions/42725726/rnn-back-propagation-through-time-when-output-is-taken-only-at-final-timestep