Neural Network unable to learn

Submitted by 丶灬走出姿态 on 2020-06-27 04:14:19

Question


I am following trask's article to build a bare-bones neural network in Python. He builds a 1-layer network (mapping 3 inputs to a single output) and a 2-layer network (3 inputs, a 4-neuron hidden layer, and an output layer with a single neuron).

My task was to build a network that can approximate the function Y = X1 + X2 + X3. I provide the network with Y, and it guesses the values of X1, X2 and X3.

For this, I modified the networks above: I tried to invert them, i.e., map a single input to 3 outputs. I have done this using the TensorFlow API, but I wish to implement it without such high-level APIs. My network looks like this:

This is how I am implementing it:

import numpy as np

# sigmoid function 
def nonlin(x, deriv=False):
    if deriv:                # slope of the sigmoid, given x = sigmoid(z)
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

#Training Data
##OUTPUT 
y=np.random.randint(1,255,size=(50,3)).astype(int)   #dims:(m,3), m is training examples
##INPUT
X = np.sum(y, axis = 1, keepdims=True)  #dims:(m,1)

#Weights for synapses 
##between Input layer and hidden layer
syn0 = 2*np.random.random((1,4)) - 1
##between hidden layer Output layer 
syn1 = 2*np.random.random((4,3)) - 1

#Training
for iter in range(100):

    # forward propagation
    l0 = X
    l1 = nonlin(np.dot(l0,syn0))
    l2 = nonlin(np.dot(l1,syn1))

    # how much did we miss?
    l2_error = y-l2
    #Visualizing the error change
    if iter % 100 == 0:
        print("Error:" + str(np.mean(np.abs(l2_error))))
    
    l2_delta = l2_error*nonlin(l2, deriv=True)
    
    l1_error = l2_delta.dot(syn1.T)
    
    # multiply how much we missed by the 
    # slope of the sigmoid at the values in l1
    l1_delta = l1_error * nonlin(l1,True)

    # update weights
    syn1 += l1.T.dot(l2_delta)
    syn0 += l0.T.dot(l1_delta)

But I notice that the weights are not updated as they should be; they stop changing soon. I am unsure what the problem could be. I tried to test the network on an input value of 300:

test_case = np.array([[300]])
l1_out = nonlin(np.dot(test_case,syn0))
l2_out = nonlin(np.dot(l1_out,syn1))
print(l2_out)

But I get weird values for x1, x2 and x3, even though I should get roughly 100, 100 and 100 respectively.
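As a sanity check (my own addition, not part of the original question), the sigmoid used in the output layer squashes every input into the open interval (0, 1), so the raw network output can never reach values like 100 no matter what the weights are:

```python
import numpy as np

# Sigmoid maps any real input into (0, 1), so a sigmoid output layer
# cannot emit targets like 100 directly, regardless of the weights.
def nonlin(x):
    return 1 / (1 + np.exp(-x))

out = nonlin(np.array([-10.0, 0.0, 10.0]))
print(out)  # all values strictly between 0 and 1
```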

What could be the problem with such a simple implementation?

(P.S.: I think it's an issue with normalization. If so, how should I implement it?)
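A minimal sketch of the normalization idea (my own assumption, not a confirmed fix: divide the input sum by 765 = 3 × 255 and the targets by 255 so everything fits in the sigmoid's (0, 1) range, then rescale the prediction back). Note that since many triples share the same sum, the best the network can do is predict roughly sum/3 per component:

```python
import numpy as np

np.random.seed(1)

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)  # derivative, assuming x is already sigmoid(z)
    return 1 / (1 + np.exp(-x))

# Same data as above, but scaled so inputs and targets lie in (0, 1).
y_raw = np.random.randint(1, 255, size=(50, 3)).astype(float)
X_raw = np.sum(y_raw, axis=1, keepdims=True)
y = y_raw / 255.0   # targets now representable by a sigmoid output
X = X_raw / 765.0   # 765 = 3 * 255, the largest possible sum

syn0 = 2 * np.random.random((1, 4)) - 1
syn1 = 2 * np.random.random((4, 3)) - 1

lr = 2.0  # explicit learning rate; gradients averaged over the batch
for it in range(20000):
    l1 = nonlin(X.dot(syn0))
    l2 = nonlin(l1.dot(syn1))
    l2_delta = (y - l2) * nonlin(l2, deriv=True)
    l1_delta = l2_delta.dot(syn1.T) * nonlin(l1, deriv=True)
    syn1 += lr * l1.T.dot(l2_delta) / len(X)
    syn0 += lr * X.T.dot(l1_delta) / len(X)

# Predict for a sum of 300 and rescale back to the original units.
pred = 255.0 * nonlin(nonlin((np.array([[300.0]]) / 765.0).dot(syn0)).dot(syn1))
print(pred)  # each component should land in the vicinity of 100
```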

来源:https://stackoverflow.com/questions/58426235/neural-network-unable-to-learn
