Question
I am following trask's article to build a bare-bones neural network in Python. He builds a 1-layer network (mapping 3 inputs to a single output) and a 2-layer network (3 inputs, a 4-neuron hidden layer, and a single-neuron output layer).
My task was to build a network that can approximate the function Y = X1 + X2 + X3 in reverse: I provide the network with Y and it guesses values of x1, x2 and x3.
For this, I modified the networks above, i.e. I tried to invert them so that a single input maps to 3 outputs. I have done this using the TensorFlow API, but wish to implement it without such high-level APIs. My network has a single input unit, a 4-neuron hidden layer, and 3 output units.
This is how I am implementing it:
import numpy as np
# sigmoid function
def nonlin(x, deriv=False):
    if deriv:  # gradient of the sigmoid
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))
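# Note: nonlin(a, deriv=True) returns a*(1-a), which equals the sigmoid's
# derivative only if a is already a sigmoid output. That is why the
# post-activation values l1 and l2 are passed to it in the loop below,
# not the raw pre-activation sums.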
#Training Data
##OUTPUT
y = np.random.randint(1, 255, size=(50, 3)).astype(int)  # dims: (m, 3), m is training examples
##INPUT
X = np.sum(y, axis=1, keepdims=True)  # dims: (m, 1)
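# note the ranges: X is in [3, 762] and y in [1, 254]; nothing here rescales
# them into the (0, 1) range that the sigmoid layers below produce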
#Weights for synapses
##between Input layer and hidden layer
syn0 = 2*np.random.random((1,4)) - 1
##between hidden layer and output layer
syn1 = 2*np.random.random((4,3)) - 1
#Training
for iter in range(100):
    # forward propagation
    l0 = X
    l1 = nonlin(np.dot(l0, syn0))
    l2 = nonlin(np.dot(l1, syn1))

    # how much did we miss?
    l2_error = y - l2

    # visualizing the error change
    if (iter % 100) == 0:
        print("Error:" + str(np.mean(np.abs(l2_error))))

    # error term for the output layer: the miss scaled by the sigmoid slope at l2
    l2_delta = l2_error * nonlin(l2, deriv=True)

    # how much did each l1 value contribute to the l2 error?
    l1_error = l2_delta.dot(syn1.T)

    # multiply how much we missed by the
    # slope of the sigmoid at the values in l1
    l1_delta = l1_error * nonlin(l1, True)

    # update weights
    syn1 += l1.T.dot(l2_delta)
    syn0 += l0.T.dot(l1_delta)
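To see what the updates themselves are doing, their magnitudes can be logged inside the loop (this diagnostic is a sketch of my own, not part of the article's code):

    # inside the loop, after l1_delta and l2_delta are computed:
    if (iter % 10) == 0:
        print("mean |syn0 update|:", np.mean(np.abs(l0.T.dot(l1_delta))))
        print("mean |syn1 update|:", np.mean(np.abs(l1.T.dot(l2_delta))))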
But I notice that the weights are not updated as they should; they stop changing after only a few iterations. I am unsure what the problem could be. I tried to test the network on the value Y = 300, hoping to get x1 = x2 = x3 = 100:
test_case = np.array([[300]])             # Y = 300
l1_out = nonlin(np.dot(test_case, syn0))  # hidden layer activations
l2_out = nonlin(np.dot(l1_out, syn1))     # predicted (x1, x2, x3)
print(l2_out)
But I get weird values for x1, x2 and x3: since the last layer is a sigmoid, they are all stuck in (0, 1), whereas I should get roughly 100, 100 and 100.
What could be the problem with such a simple implementation?
(P.S.: I think it's an issue with normalization. If so, how should I implement it?)
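If it matters, this is the kind of min-max scaling I have in mind (a sketch of my own, not from the article; the constants 765 = 3 * 255 and 255 come from the data ranges above, and the names are mine):

import numpy as np

# scale inputs and targets into (0, 1) so the sigmoid output layer can reach the targets
y_raw = np.random.randint(1, 255, size=(50, 3))  # components in [1, 254]
X_raw = np.sum(y_raw, axis=1, keepdims=True)     # sums in [3, 762]

X_SCALE = 3 * 255.0  # largest possible sum of three components
T_SCALE = 255.0      # largest possible single component

X = X_raw / X_SCALE  # network input, now in (0, 1)
y = y_raw / T_SCALE  # network targets, now in (0, 1)

# ... run the training loop above on these scaled X and y ...

# at test time, scale the query the same way and un-scale the prediction:
test_case = np.array([[300]]) / X_SCALE
# x_pred = nonlin(np.dot(nonlin(np.dot(test_case, syn0)), syn1)) * T_SCALE

Would that be the right way to do it?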
Source: https://stackoverflow.com/questions/58426235/neural-network-unable-to-learn