Question
Okay, let me preface this by saying that I am well aware that this depends on MANY factors; I'm looking for some general guidelines from people with experience.
My goal is not to make a neural net that can compute squares of numbers for me, but I thought it would be a good experiment to check whether I implemented the backpropagation algorithm correctly. Does this seem like a good idea? Anyway, I am worried that I have not implemented the learning algorithm (fully) correctly.
My Testing (Results):
- Training Data: 500 randomly generated numbers between 0.001 and 0.999 using Java's Random (the target for each input x is x²)
- Network Topology: 3 Layers with 1 input neuron, 5 hidden neurons, 1 output neuron
- Weights: all initialized to random values between -1 and 1 via java.util.Random.nextDouble() * 2 - 1 (see the setup sketch after the results below)
- Uses a bias node: the input array is sized numOfInputs + 1 so that input[input.length - 1] = 1
- Activation Function: Sigmoid
- Learning Rate: varies per run; shown in the results table below
- Have not implemented any sort of momentum, etc
- Results (network output for test inputs 0.5, 0.9, and 0.1; the targets are 0.25, 0.81, and 0.01):

| Epochs | Learning rate | Output for 0.5 | Output for 0.9 | Output for 0.1 |
| --- | --- | --- | --- | --- |
| 10,000 | 0.25 | 0.24203878039631344 | 0.7942587190918747 | -0.005433286011774396 |
| 10,000 | 0.30 | 0.2891542106869196 | 0.8159817287374298 | -0.03614377685205278 |
| 1,000 | 0.25 | 0.36399147315079117 | 0.7585916275848852 | -0.02814488264341608 |
| 1,000 | 0.30 | 0.3872669778857468 | 0.8160049820236891 | -0.03328304871978338 |
| 100,000 | 0.25 | 0.24533230649123738 | 0.8146287680498014 | 0.006935561897963849 |
| 100,000 | 0.30 | 0.24660900415003595 | 0.8097729997778165 | 0.013269893700964097 |
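For reference, here is a minimal sketch of the setup described above. The weight formula, the sigmoid activation, and the bias slot come straight from the bullet points; exactly how the (0.001, 0.999) range was sampled is an assumption.

```java
import java.util.Random;

public class SetupSketch {
    static final Random RNG = new Random();

    // Weight initialization as described above: uniform in [-1, 1]
    static double randomWeight() {
        return RNG.nextDouble() * 2 - 1;
    }

    // Sigmoid activation function
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Widen the input vector by one slot; the last entry is the bias input, fixed at 1
    static double[] withBias(double[] input) {
        double[] result = new double[input.length + 1];
        System.arraycopy(input, 0, result, 0, input.length);
        result[result.length - 1] = 1.0;
        return result;
    }

    public static void main(String[] args) {
        // 500 training pairs: x drawn from (0.001, 0.999), target x^2
        double[][] data = new double[500][2];
        for (double[] pair : data) {
            pair[0] = 0.001 + RNG.nextDouble() * 0.998;
            pair[1] = pair[0] * pair[0];
        }
        System.out.printf("first pair: %.3f -> %.6f%n", data[0][0], data[0][1]);
    }
}
```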
Are there any other simple tasks I should try training the network on to check its learning ability?
Answer 1:
One of the simplest things you can do is compute the XOR function; this is what I normally use to test "normal" multilayer perceptrons. With a learning rate of 0.2, the XOR problem is solved almost perfectly (99% averaged accuracy) in fewer than 100 epochs with a 2-5-1 topology.
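As a concrete harness, here is a sketch of such a test. The `NeuralNetwork` interface and its `train`/`predict` methods are hypothetical stand-ins for your own class; per the numbers above, a correct 2-5-1 network with learning rate 0.2 should settle close to the targets within about 100 epochs.

```java
import java.util.Arrays;

public class XorTest {
    // Stand-in for your own implementation; swap in your class here
    interface NeuralNetwork {
        void train(double[] input, double[] target);
        double[] predict(double[] input);
    }

    // Trains on the four XOR cases for the given number of epochs
    // and prints the network's output for each case.
    static void testXor(NeuralNetwork net, int epochs) {
        double[][] inputs  = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        double[][] targets = { {0},    {1},    {1},    {0}    };

        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int i = 0; i < inputs.length; i++) {
                net.train(inputs[i], targets[i]);
            }
        }
        for (double[] in : inputs) {
            System.out.printf("%s -> %s%n",
                    Arrays.toString(in), Arrays.toString(net.predict(in)));
        }
    }
}
```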
I tried your problem with an MLP I coded myself: tanh activation; no bias neuron, but a bias value for each neuron; weights initialized between 0.1 and 0.5; every bias initialized to 0.5; 1,000 training examples ranging from 0.001 to 2.0; activation normalization (the input/activation of every neuron outside the input layer is divided by the number of neurons in the parent layer); and a 1-5-1 topology. It reached 95% averaged accuracy in under 2,000 epochs every time with a learning rate of 0.1.
This difference can have several causes. For my network, the range 0.001 to 1.0 needed about twice as many epochs to learn as 0.001 to 2.0. Also, the activation normalization mentioned above drastically reduces the number of epochs needed to learn a given problem in most cases.
In addition to that I had mostly positive experiences with bias values per neuron instead of one bias neuron per layer.
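A sketch of a single-layer forward pass illustrating both of these points: a bias value per neuron, and activation normalization. Reading "normalization" as dividing the weighted sum by the parent layer's size (before applying tanh) is an assumption about the description above.

```java
public class LayerSketch {
    // Forward pass through one layer, with a bias value per neuron
    // (rather than a shared bias neuron) and activation normalization.
    static double[] forward(double[] parent,      // activations of the parent layer
                            double[][] weights,   // weights[neuron][parentNeuron]
                            double[] biases) {    // one bias per neuron in this layer
        double[] out = new double[weights.length];
        for (int n = 0; n < weights.length; n++) {
            double sum = biases[n];
            for (int p = 0; p < parent.length; p++) {
                sum += weights[n][p] * parent[p];
            }
            // Assumed normalization: divide by the parent layer's size, then apply tanh
            out[n] = Math.tanh(sum / parent.length);
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy hidden->output step: five hidden activations feeding one output neuron
        double[] hidden = {0.6, -0.2, 0.8, 0.1, 0.4};
        double[][] w = { {0.3, 0.1, 0.5, 0.2, 0.4} };
        double[] b = {0.5};
        System.out.println(java.util.Arrays.toString(forward(hidden, w, b)));
    }
}
```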
Furthermore, if your learning rate is too high (and you run lots of epochs), you may risk running into overfitting.
Answer 2:
This is a bit of necroposting, but I thought it would be useful for people new to neural networks.
For benchmarking neural networks, and machine learning models in general, one common choice is the MONK dataset and its accompanying paper by Thrun, Fahlman, et al., which you can download at
http://robots.stanford.edu/papers/thrun.MONK.html
It consists of a set of three easy classification problems, each of which is solved with a different machine learning model.
If you look at the neural network chapters, you can see how the input was encoded, which hyperparameters were set (such as number of neurons or learning rate), and what the results were, so you can easily benchmark your own implementation from there.
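For illustration, here is a sketch of a one-hot encoding of the MONK attributes. It assumes the UCI copy of the dataset, where each line reads `class a1 a2 a3 a4 a5 a6 id` and the six attributes take 3, 3, 2, 3, 4, and 2 values respectively (17 inputs in total); check the paper for the encoding actually used there.

```java
public class MonkEncoder {
    // Number of values each of the six MONK attributes can take
    // (per the UCI dataset description; 3 + 3 + 2 + 3 + 4 + 2 = 17 inputs)
    static final int[] CARDINALITIES = {3, 3, 2, 3, 4, 2};

    // One-hot encodes a line such as "1 1 1 1 1 3 1 data_5"
    // into a 17-element input vector (the class label is tokens[0]).
    static double[] encode(String line) {
        String[] tokens = line.trim().split("\\s+");
        double[] input = new double[17];
        int offset = 0;
        for (int a = 0; a < 6; a++) {
            int value = Integer.parseInt(tokens[a + 1]); // skip the class label
            input[offset + value - 1] = 1.0;             // attribute values start at 1
            offset += CARDINALITIES[a];
        }
        return input;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(encode("1 1 1 1 1 3 1 data_5")));
    }
}
```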
I think it is a bit more robust than the XOR problem (I speak from experience: when I first implemented a neural network, my faulty implementation happened to solve the XOR problem but failed on the MONK problems).
Source: https://stackoverflow.com/questions/30688527/how-many-epochs-should-a-neural-net-need-to-learn-to-square-testing-results-in