Does bias in the convolutional layer really make a difference to the test accuracy?

Submitted by 微笑、不失礼 on 2019-12-04 12:22:17

Biases are tuned alongside weights by learning algorithms such as gradient descent. Biases differ from weights in that they are independent of the output from the previous layer. Conceptually, a bias behaves like the input from a neuron with a fixed activation of 1, so it is updated by subtracting just the product of the delta value and the learning rate.
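To make that update rule concrete, here is a minimal NumPy sketch of a single gradient-descent step for one neuron (the variable names and values are illustrative, not from the original answer). Because the bias's "input" is the constant 1, its update has no input-activation factor:

```python
import numpy as np

learning_rate = 0.1
weights = np.array([0.5, -0.3])  # hypothetical weights
bias = 0.2                       # hypothetical bias
inputs = np.array([1.0, 2.0])    # activations from the previous layer
delta = 0.05                     # error term (delta) for this neuron

# Weight updates scale with the incoming activations...
weights -= learning_rate * delta * inputs
# ...while the bias update is just delta times the learning rate,
# i.e. the same rule with a fixed input of 1.
bias -= learning_rate * delta
```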

In a large model, removing the bias inputs makes very little difference, because each node can effectively construct a bias out of the average activation of all of its inputs, which by the law of large numbers will be roughly normal. At the first layer, whether this can happen depends on your input distribution; for MNIST, for example, the input's average activation is roughly constant. In a small network you do need a bias input, but in a large network, removing it makes almost no difference, as the sketch below illustrates.
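One way to check this empirically is to train the same small CNN on MNIST twice, toggling only the `use_bias` argument that Keras `Conv2D` and `Dense` layers accept. This is a minimal sketch under assumed architecture and hyperparameters, not a definitive benchmark:

```python
import tensorflow as tf

def make_cnn(use_bias):
    # Hypothetical small CNN; only use_bias varies between runs.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(32, 3, activation="relu", use_bias=use_bias),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu", use_bias=use_bias),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, use_bias=use_bias),  # logits
    ])

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0  # add channel dim, scale to [0, 1]
x_test = x_test[..., None] / 255.0

for use_bias in (True, False):
    model = make_cnn(use_bias)
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    model.fit(x_train, y_train, epochs=1, verbose=0)  # increase epochs for a fairer comparison
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"use_bias={use_bias}: test accuracy = {acc:.4f}")
```

If the answer's reasoning holds, the two printed accuracies should be close for a network of this size, with the gap widening as the model shrinks.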

