Training a neural network to add

Backend · Unresolved · 5 answers · 1201 views

I need to train a network to multiply or add 2 inputs, but it doesn't seem to approximate well for all points after 20,000 iterations. More specifically, I train it on the whole

5 answers
  •  既然无缘
    2021-02-02 11:35

    If you want to keep things neural (links have weights, the neuron computes the weighted sum of its inputs and answers 0 or 1 depending on the sigmoid of that sum, and you use backpropagation of the gradient), then you should think of each neuron of the hidden layer as a classifier. It defines a line that separates the input space into two classes: one class corresponds to the part where the neuron responds 1, the other to the part where it responds 0. A second neuron of the hidden layer defines another separation, and so forth. The output neuron combines the outputs of the hidden layer, adapting its weights so that its output matches the targets you presented during learning.
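That single-neuron picture can be sketched directly; the function name and the example weights below are purely illustrative:

```python
import math

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs, squashed by a sigmoid,
    then thresholded to a 0/1 class label."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0

# The weights and bias define the separating line w1*x1 + w2*x2 + b = 0;
# here that line is x1 + x2 = 2.5.
print(neuron([1.0, 2.0], [1.0, 1.0], -2.5))  # above the line -> class 1
print(neuron([0.5, 0.5], [1.0, 1.0], -2.5))  # below the line -> class 0
```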
    Hence, a single neuron will classify the input space into 2 classes (maybe corresponding to an addition, depending on the learning database). Two neurons will be able to define 4 classes, three neurons 8 classes, etc. Think of the outputs of the hidden neurons as binary digits: h1*2^0 + h2*2^1 + ... + hn*2^(n-1), where hi is the output of hidden neuron i. NB: you will need n output neurons. This answers the question about the number of hidden neurons to use.
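Taken literally, that powers-of-2 reading means the training target for each pair is the binary encoding of the sum, one output neuron per bit. A minimal sketch of the encoding and its decoding (the function names and bit count are just illustrative):

```python
def sum_to_bits(a, b, n_bits=5):
    """Binary target vector for a + b, least significant bit first:
    entry k is what the k-th output neuron should produce."""
    s = a + b
    return [(s >> k) & 1 for k in range(n_bits)]

def bits_to_sum(bits):
    """Decode the outputs back: h1*2^0 + h2*2^1 + ..."""
    return sum(h * 2**k for k, h in enumerate(bits))

print(sum_to_bits(3, 4))                # 7 as bits, LSB first
print(bits_to_sum(sum_to_bits(3, 4)))   # 7
```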
    But the NN doesn't compute the addition. It sees it as a classification problem based on what it learned, and it will never be able to generate a correct answer for values that are outside its learning base. During the learning phase, it adjusts the weights in order to place the separators (lines, in 2D) so as to produce the correct answers. If your inputs are in [0,10], it will learn to produce the correct answers for additions of values in [0,10]^2, but it will never give a good answer for 12 + 11.
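That extrapolation failure is easy to reproduce. The sketch below (a tiny sigmoid-hidden-layer network trained by full-batch gradient descent on pairs from [0,1]^2; the sizes, rates, and iteration count are arbitrary choices, not the asker's setup) fits in-range sums reasonably well, yet its output is bounded because the sigmoids saturate, so it cannot come anywhere near 12 + 11:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))      # training pairs from [0, 1]^2
y = X.sum(axis=1, keepdims=True)          # targets: a + b

H = 10                                    # hidden neurons
W1 = rng.normal(0, 1, (2, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 1, (H, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(10000):                    # full-batch gradient descent, MSE
    h = sigmoid(X @ W1 + b1)
    err = (h @ W2 + b2) - y
    dh = err @ W2.T * h * (1 - h)         # backprop through the sigmoid
    W2 -= lr * h.T @ err / len(X); b2 -= lr * err.mean(0)
    W1 -= lr * X.T @ dh / len(X);  b1 -= lr * dh.mean(0)

def predict(a, b):
    return float(sigmoid(np.array([a, b]) @ W1 + b1) @ W2 + b2)

print(predict(0.3, 0.4))    # close to 0.7: inside the training range
print(predict(12.0, 11.0))  # nowhere near 23: the sigmoids are saturated
```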
    If your last examples are learned well and the first ones forgotten, try lowering the learning rate: the weight updates (which depend on the gradient) for the last examples may override those for the first ones (if you're using stochastic backprop). Be sure that your learning base is balanced. You can also present the badly learned examples more often. And try many values of the learning rate until you find a good one.
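A minimal way to act on those suggestions: reshuffle the examples every epoch so no part of the learning base dominates, and sweep the learning rate. This sketch uses plain stochastic gradient descent on a single linear neuron as a toy stand-in, not the asker's actual network:

```python
import numpy as np

def train_sgd(lr, epochs=200, seed=0):
    """Stochastic gradient descent on y = w1*a + w2*b for the addition
    task; returns the final mean absolute error on the training set."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0, 1, (100, 2))
    y = X.sum(axis=1)
    w = np.zeros(2)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):   # fresh shuffle every epoch
            err = X[i] @ w - y[i]
            w -= lr * err * X[i]            # per-example update
    return float(np.abs(X @ w - y).mean())

# try several learning rates and keep the best-behaved one
for lr in (0.5, 0.05, 0.005):
    print(lr, train_sgd(lr))
```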
