Multilayer Perceptron replaced with Single Layer Perceptron

Submitted by 断了今生、忘了曾经 on 2020-01-02 23:14:49

Question


I'm having trouble understanding the difference between an MLP and an SLP.

I know that an MLP has more than one layer (the hidden layers) and that its neurons have a non-linear activation function, such as the logistic function (needed for gradient descent). But I have read that:

"if all neurons in an MLP had a linear activation function, the MLP could be replaced by a single layer of perceptrons, which can only solve linearly separable problems"

I don't understand why, in the specific case of XOR, which is not linearly separable, the equivalent MLP is a two-layer network in which every neuron has a linear activation function, like the step function. I understand that two lines are needed for the separation, but in this case I cannot apply the rule from the previous statement (replacing the MLP with an SLP).
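To be concrete, here is my sketch of the standard argument (with step(z) = 1 if z >= 0, else 0) for why a single unit step(w1 x1 + w2 x2 + b) cannot compute XOR:

    XOR(0,0) = 0  requires  b < 0
    XOR(1,0) = 1  requires  w1 + b >= 0
    XOR(0,1) = 1  requires  w2 + b >= 0
    XOR(1,1) = 0  requires  w1 + w2 + b < 0

Adding the middle two gives w1 + w2 + 2b >= 0, i.e. w1 + w2 + b >= -b > 0 (since b < 0), which contradicts the last line.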

MLP for XOR:

http://s17.postimg.org/c7hwv0s8f/xor.png

In the linked image, the neurons A, B, and C have a linear activation function (like the step function).

XOR: http://s17.postimg.org/n77pkd81b/xor1.png


Answer 1:


A linear function is f(x) = a x + b. If we take another linear function g(z) = c z + d and apply g(f(x)) (which is equivalent to feeding the output of one linear layer into the next linear layer), we get g(f(x)) = c (a x + b) + d = a c x + c b + d = (a c) x + (c b + d), which is itself another linear function.
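A minimal NumPy sketch of this collapse (the weight matrices, biases, and input here are arbitrary illustrative values, not taken from the question's network):

    import numpy as np

    # Two "linear layers": f(x) = A x + b, then g(z) = C z + d.
    A = np.array([[1.0, -2.0], [0.5, 3.0]])
    b = np.array([0.1, -0.3])
    C = np.array([[2.0, 0.0], [-1.0, 1.0]])
    d = np.array([0.5, 0.5])

    x = np.array([0.7, -1.2])

    # Feed the output of the first linear layer into the second.
    stacked = C @ (A @ x + b) + d

    # The single equivalent linear layer: weights C A, bias C b + d.
    collapsed = (C @ A) @ x + (C @ b + d)

    print(np.allclose(stacked, collapsed))  # True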

The step function is not a linear function: you cannot write it as a x + b. That's why an MLP using a step function is strictly more expressive than a single-layer perceptron using a step function.
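For illustration, here is a minimal sketch of a two-layer step-activation network that computes XOR, along the lines of the linked image (the specific weights and thresholds are hand-picked for this sketch and may differ from the image's values):

    import numpy as np

    def step(z):
        # Heaviside step: 1 if z >= 0, else 0 -- not writable as a x + b.
        return (z >= 0).astype(float)

    # Hidden layer: one unit computes OR, the other computes AND.
    W1 = np.array([[1.0, 1.0],    # OR:  fires when x1 + x2 >= 0.5
                   [1.0, 1.0]])   # AND: fires when x1 + x2 >= 1.5
    b1 = np.array([-0.5, -1.5])

    # Output unit: fires when OR is true and AND is false, i.e. XOR.
    W2 = np.array([[1.0, -1.0]])
    b2 = np.array([-0.5])

    for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        h = step(W1 @ np.array(x, dtype=float) + b1)
        y = step(W2 @ h + b2)
        print(x, "->", int(y[0]))
    # (0, 0) -> 0, (0, 1) -> 1, (1, 0) -> 1, (1, 1) -> 0

Because the step non-linearity sits between the two layers, the weight matrices cannot be collapsed into one, which is exactly what makes the XOR solution possible.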



Source: https://stackoverflow.com/questions/30559405/multilayer-perceptron-replaced-with-single-layer-perceptron
