Should we do learning rate decay for the Adam optimizer?

深忆病人 2021-01-29 19:10

I'm training a network for image localization with the Adam optimizer, and someone suggested that I use exponential decay. I don't want to try that because the Adam optimizer itself deca

4 Answers

孤独总比滥情好 (original poster)

2021-01-29 19:45
Yes, absolutely. In my experience, it's very useful to combine Adam with learning rate decay. Without decay, you have to set a very small learning rate so that the loss doesn't start to diverge after it has decreased to a certain point. Here is the code for using Adam with learning rate decay in TensorFlow. I hope it is helpful to someone.
```
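import tensorflow as tf  # TensorFlow 1.x API
# learning_rate and adam_epsilon are hyperparameters you define elsewhere.
# Pass global_step to opt.minimize(...) so the step counter advances.
global_step = tf.train.get_or_create_global_step()
decayed_lr = tf.train.exponential_decay(learning_rate,
                                        global_step, 10000,
                                        0.95, staircase=True)
opt = tf.train.AdamOptimizer(decayed_lr, epsilon=adam_epsilon)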
```
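To make the schedule concrete, here is a minimal plain-Python sketch of what exponential decay with `staircase=True` computes: the base rate multiplied by `decay_rate` once per completed block of `decay_steps` steps. The function name `staircase_exponential_decay` and the initial rate of `1e-3` are my own assumptions for illustration; only the `10000` and `0.95` values come from the snippet above.

```python
def staircase_exponential_decay(initial_lr, step, decay_steps=10000, decay_rate=0.95):
    """Return initial_lr * decay_rate ** floor(step / decay_steps)."""
    # Integer division gives the "staircase": the rate only changes
    # once every decay_steps steps instead of continuously.
    completed_blocks = step // decay_steps
    return initial_lr * decay_rate ** completed_blocks


if __name__ == "__main__":
    # Hypothetical initial learning rate, chosen only for illustration.
    for step in (0, 9999, 10000, 50000):
        print(step, staircase_exponential_decay(1e-3, step))
```

With these values the rate stays at its initial value for the first 9999 steps and drops by 5% at step 10000, so the decay is quite gentle over a typical training run.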