I am using the nnet function in R to train my neural network. I don't understand what the decay parameter in nnet is. Is it the step size used in the gradient descent method, or regularization?
Complementing blahdiblah's answer: from looking at the source code, I think the parameter weights corresponds to the learning rate of back-propagation (I couldn't work out what it was from the manual). Look at line 236 of the file nnet.c, inside the function fpass:
TotalError += wx * E(Outputs[i], goal[i - FirstOutput]);
Here, in a very intuitive nomenclature, E corresponds to the back-propagation error and wx is a parameter passed to the function, which eventually corresponds to the identifier Weights[i].
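To make the role of wx concrete, here is a minimal sketch of the same accumulation pattern. It is not the nnet source: the function names squared_error and add_case_error are hypothetical, and the per-output error is assumed to be a plain squared difference. Because wx multiplies every error term of a case, it scales that case's share of the total error, and therefore its share of the gradient as well.

```c
#include <stddef.h>

/* Assumed per-output error: squared difference (illustrative only,
   standing in for the E(...) call in the quoted line). */
double squared_error(double output, double goal)
{
    double diff = output - goal;
    return diff * diff;
}

/* Add one case's error to a running total, scaled by that case's
   weight wx. Mirrors the shape of:
       TotalError += wx * E(Outputs[i], goal[i - FirstOutput]);   */
double add_case_error(double total_error, double wx,
                      const double *outputs, const double *goals,
                      size_t n_outputs)
{
    for (size_t i = 0; i < n_outputs; i++)
        total_error += wx * squared_error(outputs[i], goals[i]);
    return total_error;
}
```

With wx = 0 a case contributes nothing to the total, and with wx = 2 it counts twice as much as a case with wx = 1.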
You can also confirm that the parameter decay is indeed what it claims to be by going to lines 317-319 of the same file, inside the function VR_dfunc:
for (i = 0; i < Nweights; i++)
    sum1 += Decay[i] * p[i] * p[i];
*fp = TotalError + sum1;
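where p corresponds to the connection weights. Adding the sum of Decay[i] * p[i] * p[i] to the total error is exactly the definition of weight-decay regularization, so decay is a regularization parameter and not a step size.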
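As a rough illustration of what that penalty does, the sketch below recomputes the penalized objective from the quoted lines and adds the matching gradient term. The function names (decay_penalty, penalized_objective, add_decay_gradient) are hypothetical and not taken from nnet.c. The gradient formula 2 * decay[i] * p[i] simply follows from differentiating decay[i] * p[i]^2 with respect to p[i].

```c
#include <stddef.h>

/* Weight-decay penalty: sum over all weights of decay[i] * p[i]^2. */
double decay_penalty(const double *decay, const double *p, size_t n_weights)
{
    double sum = 0.0;
    for (size_t i = 0; i < n_weights; i++)
        sum += decay[i] * p[i] * p[i];
    return sum;
}

/* Penalized objective: data error plus the penalty, matching the
   shape of "*fp = TotalError + sum1;" in the quoted snippet. */
double penalized_objective(double total_error, const double *decay,
                           const double *p, size_t n_weights)
{
    return total_error + decay_penalty(decay, p, n_weights);
}

/* Add the penalty's gradient to an existing gradient vector.
   d/dp[i] of decay[i] * p[i]^2 is 2 * decay[i] * p[i], so each
   weight is pulled toward zero in proportion to its size. */
void add_decay_gradient(double *grad, const double *decay,
                        const double *p, size_t n_weights)
{
    for (size_t i = 0; i < n_weights; i++)
        grad[i] += 2.0 * decay[i] * p[i];
}
```

A larger decay value makes large weights more expensive, which is why the optimizer ends up with smaller weights. It changes the function being minimized, not how far each optimization step moves.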