How to handle extremely long LSTM sequence length?

死守一世寂寞 2021-02-09 05:11

I have some data that is sampled at a very high rate (on the order of hundreds of times per second). This results in a sequence length that is huge (~90,000 samples) on avera

3 answers
  •  猫巷女王i
    2021-02-09 05:37

    When you have very long sequences, RNNs can run into vanishing and exploding gradients.

    There are methods for dealing with this, but first it helps to understand why they are needed: backpropagation through time becomes very hard on long sequences because of the problems mentioned above.

    LSTMs reduce these problems by a large margin, but with a sequence this long you can still run into them.

    One approach is gradient clipping, which means setting an upper bound on the gradients. Refer to this Stack Overflow question.
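
    As a rough illustration of setting an upper bound on the gradients, here is a minimal, framework-free sketch of clipping by global norm, which is one common variant (clipping each value individually is another). The function name `clip_by_global_norm` and the flat list of floats are assumptions made for this example. In practice you would more likely use the built-in utilities of your framework, such as `torch.nn.utils.clip_grad_norm_` in PyTorch or `tf.clip_by_global_norm` in TensorFlow.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Hypothetical helper for illustration only; returns (clipped_grads, original_norm).
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already within the bound: leave the gradients unchanged.
        return list(grads), total_norm
    # Shrink every component by the same factor so the direction is preserved.
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

    Here the original norm is 5.0, so both components are scaled by the same factor and the clipped vector keeps its direction with a norm of about 1.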

    As for the question you asked:

    What are some methods to effectively 'chunk' these sequences?

    One way is truncated backpropagation through time (BPTT). There are a number of ways to implement truncated BPTT. The simple ideas are:

    1. Calculate the gradients only for a given number of time steps. For example, if your sequence is 200 time steps long and you specify 10 time steps, gradients are calculated over 10 time steps only, and the stored memory value at the end of those 10 steps is passed to the next chunk as its initial cell state. This is the method TensorFlow uses for truncated BPTT.

    2. Take the full sequence and backpropagate gradients only for a given number of time steps from a selected time block. This is a continuous approach.
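
    The first style can be sketched without any framework. The toy below is an assumption-laden illustration, not TensorFlow's implementation: it uses a scalar RNN `h_t = tanh(w*h_{t-1} + u*x_t)` in place of an LSTM, a squared-error loss, and plain gradient descent, and the names `tbptt_scalar_rnn`, `chunk_len`, and `lr` are made up for this example. The point is the structure: the forward state is carried from chunk to chunk, while the backward pass stops at each chunk boundary.

```python
import math


def tbptt_scalar_rnn(xs, ys, chunk_len, w=0.5, u=0.5, lr=0.01):
    """Train a toy scalar RNN with truncated BPTT (hypothetical example).

    Returns the updated weights (w, u) and the number of chunks processed.
    """
    h = 0.0  # hidden state, carried across chunk boundaries
    n_chunks = 0
    for start in range(0, len(xs), chunk_len):
        x_chunk = xs[start:start + chunk_len]
        y_chunk = ys[start:start + chunk_len]

        # Forward pass over this chunk only, starting from the carried state.
        hs = [h]  # hs[0] is the carried-in state, treated as a constant
        for x in x_chunk:
            hs.append(math.tanh(w * hs[-1] + u * x))

        # Backward pass restricted to the chunk.
        grad_w = grad_u = 0.0
        dh_next = 0.0  # gradient arriving from later steps inside the chunk
        for t in reversed(range(len(x_chunk))):
            dh = 2.0 * (hs[t + 1] - y_chunk[t]) + dh_next
            dz = dh * (1.0 - hs[t + 1] ** 2)  # derivative of tanh
            grad_w += dz * hs[t]
            grad_u += dz * x_chunk[t]
            dh_next = dz * w  # gradient for the previous time step
        # dh_next is dropped here instead of flowing into the previous chunk:
        # this is the truncation.

        w -= lr * grad_w
        u -= lr * grad_u
        h = hs[-1]  # the final state becomes the next chunk's initial state
        n_chunks += 1
    return w, u, n_chunks


# 200 time steps in chunks of 10, as in the example above.
xs = [math.sin(0.1 * i) for i in range(200)]
ys = [math.sin(0.1 * (i + 1)) for i in range(200)]
w, u, n_chunks = tbptt_scalar_rnn(xs, ys, chunk_len=10)
```

    With a real LSTM in a framework, the same pattern usually amounts to feeding each chunk with the previous chunk's final hidden and cell state after cutting it out of the computation graph, so that gradients cannot flow back past the chunk boundary.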

    Here is the best article I found explaining these truncated BPTT methods; it is very easy to follow. Refer to "Styles of Truncated Backpropagation".
