发表新帖

发表新帖

Loss clipping in tensor flow (on DeepMind's DQN)

后端未结

关注

 4  525

渐次进展 2021-01-02 06:39

I am trying my own implementation of the DQN paper by Deepmind in tensor flow and am running into difficulty with clipping of the loss function.

Here is an excerpt

4条回答

走了就别回头了 (楼主)

2021-01-02 07:10
1. No. They talk about error clipping actually, not about loss clipping which is however as far as I know referring to the same thing but leads to confusion. They DO NOT mean that the loss below -1 is clipped to -1 and the loss above +1 is clipped to +1 because that leads to zero gradients outside the error range [-1;1] as you realized. Instead, they suggest to use a linear loss instead of a quadratic loss for error values < -1 and error values > 1.
2. Compute the error value (r + \gamma \max_{a'} Q(s',a'; \theta_i^-) - Q(s,a; \theta_i)). If this error value is within the range [-1;1], square it, if the error value is < -1 multiply by -1, if the error value is > 1 leave it as it is. If you use this as loss function the gradients outside the interval [-1;1] won't vanish.
In order to have a "smooth-looking" compound loss function you could also replace the squared loss outside the error range [-1;1] with a first-order Taylor approximation at the border values -1 and 1. In this case, if e was your error value, you would square it in case e \in [-1;1], in case e < -1, replace it by -2e-1, in case e > 1, replace it by 2e-1.
0 讨论(0)

查看其它4个回答
发布评论:

提交评论
- 加载中...

热议问题