Neural Activation Functions - Difference between Logistic / Tanh / etc

广开言路 2021-02-08 09:50

I'm writing some basic neural network methods - specifically the activation functions - and have hit the limits of my rubbish knowledge of math. I understand the respective output ranges (0 to 1 for the logistic function, -1 to 1 for tanh), but I'm unsure when one should be preferred over the other.
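
For reference (added context, not part of the original question), the two functions being compared and their output ranges are:

σ(z) = 1 / (1 + e^(-z)),  output in (0, 1)
tanh(z) = (e^z - e^(-z)) / (e^z + e^(-z)),  output in (-1, 1)

The two are related by tanh(z) = 2σ(2z) - 1, so tanh is a scaled and shifted logistic.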

4 Answers
  •  长发绾君心
    2021-02-08 10:14

    The word is (and I've tested it) that in some cases it can be better to use tanh than the logistic function, for two reasons:

    1. An output near y = 0 from the logistic function, multiplied by a weight w, yields a value near 0, which has little effect on the layers above it (although the absence of a signal is itself information). By contrast, a tanh output near y = -1 multiplied by a weight w can yield a value of large magnitude, which has more numeric effect.
    2. The derivative of tanh, 1 - y^2, yields larger values than the derivative of the logistic, y(1 - y) = y - y^2. For example, at z = 0 the logistic function gives y = 0.5 and y' = 0.25, while tanh gives y = 0 but y' = 1 (you can see this in general just by looking at the graphs). This means a tanh layer may learn faster than a logistic layer because of the larger gradient magnitude; see the sketch after this list.
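
    Here is a minimal NumPy sketch (my addition, not code from the answerer) verifying the z = 0 comparison above. The helper names `logistic_prime` and `tanh_prime` are hypothetical; each expresses the derivative in terms of the unit's output y, as backpropagation implementations typically do:

    ```python
    import numpy as np

    def logistic(z):
        # Logistic sigmoid: maps z to (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def logistic_prime(y):
        # Derivative of the logistic, written in terms of its output y
        return y * (1.0 - y)

    def tanh_prime(y):
        # Derivative of tanh, written in terms of its output y
        return 1.0 - y ** 2

    z = 0.0
    y_log = logistic(z)    # 0.5
    y_tanh = np.tanh(z)    # 0.0

    print(logistic_prime(y_log))  # 0.25 -- smaller gradient at z = 0
    print(tanh_prime(y_tanh))     # 1.0  -- larger gradient at z = 0
    ```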
