Question
I have heard a lot about "breaking the symmetry" within the context of neural network programming and initialization. Can somebody please explain what this means? As far as I can tell, it has something to do with neurons performing identically during forward and backward propagation if the weight matrix is filled with identical values during initialization. Asymmetric behavior, by contrast, would come from random initialization, i.e., not using identical values throughout the matrix.
Answer 1:
Your understanding is correct.
When all initial values are identical, for example when every weight is initialized to 0, then during backpropagation all weights receive the same gradient and hence the same update. Since the weights start out equal and are updated identically, they remain equal after every step. This is what is referred to as the symmetry.
Intuitively, that means all nodes will learn the same thing, and we don't want that, because we want the network to learn different kinds of features. This is achieved by random initialization: the gradients then differ across nodes, so each node grows to be more distinct from the others, enabling diverse feature extraction. This is what is referred to as breaking the symmetry.
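As a minimal sketch of this effect (not from the original answer; the network shape, names like `W1`/`W2`, and the squared-error loss are my own assumptions), the NumPy snippet below computes gradients for a tiny one-hidden-layer network once with all-identical weights and once with random weights. With identical initialization, both hidden units receive exactly the same gradient row, so they would stay identical after the update; random initialization makes the rows differ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grads(W1, W2, x, y):
    """One forward/backward pass for a 2-input -> 2-hidden -> 1-output net
    with squared-error loss 0.5 * (y_hat - y)**2."""
    h = sigmoid(W1 @ x)                    # hidden activations, shape (2,)
    y_hat = W2 @ h                         # scalar output
    d_out = y_hat - y                      # dL/dy_hat
    dW2 = d_out * h                        # gradient w.r.t. output weights
    dh = d_out * W2                        # gradient w.r.t. hidden activations
    dW1 = np.outer(dh * h * (1 - h), x)    # gradient w.r.t. hidden weights
    return dW1, dW2

x = np.array([0.5, -1.0])
y = 1.0

# Symmetric initialization: every weight gets the same value.
W1_sym = np.full((2, 2), 0.3)
W2_sym = np.full(2, 0.3)
dW1_sym, _ = grads(W1_sym, W2_sym, x, y)
print(dW1_sym)   # both rows (both hidden units) are identical -> identical updates

# Random initialization breaks the symmetry.
rng = np.random.default_rng(0)
W1_rnd = rng.normal(scale=0.3, size=(2, 2))
W2_rnd = rng.normal(scale=0.3, size=2)
dW1_rnd, _ = grads(W1_rnd, W2_rnd, x, y)
print(dW1_rnd)   # rows differ -> the hidden units update differently
```

Running it shows the two rows of `dW1_sym` are exactly equal, while the rows of `dW1_rnd` are not, which is the whole point of random initialization here.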
Source: https://stackoverflow.com/questions/59638646/what-does-it-mean-to-break-symmetry-in-the-context-of-neural-network-programm