Implementing a perceptron with backpropagation algorithm

野的像风 2021-02-10 01:54

I am trying to implement a two-layer perceptron with backpropagation to solve the parity problem. The network has 4 binary inputs, 4 hidden units in the first layer, and 1 output unit.

1 Answer
  •  囚心锁ツ
    2021-02-10 02:53

    I think I spotted the problem; funnily enough, the issue is already visible in your high-level description, though I only noticed it when something looked odd in the code. First, the description:

    for each hidden neuron h connected to the output layer
    h.weight connecting h to output = learningRate * outputDelta * h.value
    
    for each input neuron x connected to the hidden layer
    x.weight connecting x to h[i] = learningRate * hiddenDelta[i] * x.value
    

    I believe h.weight should be updated relative to its previous value. Your update mechanism instead overwrites it with a value computed only from the learning rate, the output delta, and the value of the node. Likewise, x.weight is overwritten using only the learning rate, the hidden delta, and the value of the node:

        /*** Weight updates ***/
    
        // update weights connecting hidden neurons to output layer
        for (i = 0; i < output.size(); i++) {
            for (Neuron h : output.get(i).left) {
                h.weights[i] = learningRate * outputDelta[i] * h.value;
            }
        }
    
        // update weights connecting input neurons to hidden layer
        for (i = 0; i < hidden.size(); i++) {
            for (Neuron x : hidden.get(i).left) {
                x.weights[i] = learningRate * hiddenDelta[i] * x.value;
            }
        }
    

    I do not know what the correct solution is; but I have two suggestions:

    1. Replace these lines:

              h.weights[i] = learningRate * outputDelta[i] * h.value;
              x.weights[i] = learningRate * hiddenDelta[i] * x.value;
      

      with these lines:

              h.weights[i] += learningRate * outputDelta[i] * h.value;
              x.weights[i] += learningRate * hiddenDelta[i] * x.value;
      

      (+= instead of =.)

    2. Replace these lines:

              h.weights[i] = learningRate * outputDelta[i] * h.value;
              x.weights[i] = learningRate * hiddenDelta[i] * x.value;
      

      with these lines:

              h.weights[i] *= learningRate * outputDelta[i];
              x.weights[i] *= learningRate * hiddenDelta[i];
      

      (Ignore the value and simply scale the existing weight. The learning rate should be 1.05 instead of 0.05 for this change.)
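    To illustrate why suggestion 1 matters, here is a minimal, self-contained sketch of the accumulative update rule `w += learningRate * delta * value` applied to a single sigmoid neuron learning the AND function. This is not your network; the class and method names (`AccumulativeUpdateDemo`, `trainAnd`) are made up for the demo, and AND is used instead of parity because a single neuron can learn it. The point is that `+=` lets each training step nudge the weights, so corrections accumulate across epochs, whereas `=` would discard all previous learning on every step:

    ```java
    public class AccumulativeUpdateDemo {
        static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

        // Trains a single sigmoid neuron on AND and returns its rounded
        // predictions for the four input patterns.
        static long[] trainAnd() {
            double[][] inputs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
            double[] targets = {0, 0, 0, 1}; // AND truth table
            double[] w = {0.1, -0.1};
            double bias = 0.0;
            double learningRate = 0.5;

            for (int epoch = 0; epoch < 5000; epoch++) {
                for (int p = 0; p < inputs.length; p++) {
                    double net = bias;
                    for (int i = 0; i < w.length; i++) net += w[i] * inputs[p][i];
                    double out = sigmoid(net);
                    // delta for a sigmoid output unit: error times derivative
                    double delta = (targets[p] - out) * out * (1.0 - out);
                    // accumulative update: adjust the weights, don't overwrite them
                    for (int i = 0; i < w.length; i++)
                        w[i] += learningRate * delta * inputs[p][i];
                    bias += learningRate * delta;
                }
            }

            long[] preds = new long[inputs.length];
            for (int p = 0; p < inputs.length; p++) {
                double net = bias;
                for (int i = 0; i < w.length; i++) net += w[i] * inputs[p][i];
                preds[p] = Math.round(sigmoid(net));
            }
            return preds;
        }

        public static void main(String[] args) {
            System.out.println(java.util.Arrays.toString(trainAnd()));
        }
    }
    ```

    If you change `+=` back to `=` in this demo, the weights never settle, because each pattern wipes out what the previous patterns taught the neuron.
    
    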
