I'm writing some basic neural network methods - specifically the activation functions - and have hit the limits of my rubbish knowledge of math. I understand the respective ranges (tanh: (-1, 1), logistic: (0, 1)).
The word is (and I've tested) that in some cases it might be better to use tanh than the logistic, since an output of y = 0 from the logistic, multiplied by a weight w, yields a value near 0, which doesn't have much effect on the upper layers it feeds into (although absence also has an effect). However, an output near y = -1 from tanh, multiplied by a weight w, may yield a large number with more numeric effect.
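To make that concrete, here's a minimal sketch; the weight w = 0.7 and pre-activation z = -4.0 are arbitrary values picked purely for illustration:

```python
import math

def logistic(z):
    # Logistic sigmoid: squashes z into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

w = 0.7   # hypothetical weight, chosen only for illustration
z = -4.0  # strongly negative pre-activation, also just for illustration

# The logistic squashes z toward 0, so the weighted signal nearly vanishes;
# tanh squashes z toward -1, so the weighted signal keeps a large magnitude.
print(logistic(z) * w)   # ~0.0126  -> barely affects the next layer
print(math.tanh(z) * w)  # ~-0.6995 -> a value with real numeric effect
```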
Also, the derivative of tanh (y' = 1 - y^2) yields values greater than that of the logistic (y' = y(1 - y) = y - y^2). For example, when z = 0 the logistic function yields y = 0.5 and y' = 0.25, while tanh yields y = 0 but y' = 1 (you can see this in general just by looking at the graphs). Meaning that a tanh layer might learn faster than a logistic layer because of the magnitude of the gradient.
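Here's a quick sketch that checks those values at z = 0, using the same y' formulas as above:

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_prime(z):
    y = logistic(z)
    return y * (1.0 - y)  # y(1 - y): peaks at 0.25 when z = 0

def tanh_prime(z):
    y = math.tanh(z)
    return 1.0 - y * y    # 1 - y^2: peaks at 1 when z = 0

print(logistic(0.0), logistic_prime(0.0))  # 0.5 0.25
print(math.tanh(0.0), tanh_prime(0.0))     # 0.0 1.0
```

Since backpropagation scales the error at each layer by y', the larger tanh derivative translates directly into larger weight updates near z = 0.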