I'm trying to write out a bit of code for the gradient descent algorithm explained in the Stanford Machine Learning lecture (lecture 2 at around 25:00). Below is the implementation.
When your cost function increases or oscillates up and down, you usually have too large a value for alpha. What alpha are you using?
Start out with alpha = 0.001 and see if that converges. If not, try a range of alphas (0.003, 0.01, 0.03, 0.1, 0.3, 1) and find one that converges quickly.
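This sweep can be sketched as follows. It's a minimal batch gradient descent for univariate linear regression over a toy dataset (all names and data here are illustrative, not from the original post); a diverging alpha shows up as a cost that blows up to inf/nan instead of shrinking:

```python
import numpy as np

def gradient_descent(x, y, alpha, iters=500):
    """Fit h(x) = theta[0] + theta[1]*x by batch gradient descent.

    Returns (theta, final cost J(theta))."""
    theta = np.zeros(2)
    cost = np.inf
    # Suppress overflow warnings so a diverging run fails quietly
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(iters):
            err = theta[0] + theta[1] * x - y
            # Simultaneous update of both parameters
            theta = theta - alpha * np.array([err.mean(), (err * x).mean()])
            cost = (err ** 2).mean() / 2  # J(theta)
    return theta, cost

# Toy data: y = 2x + 1 plus a little noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 0.1, 50)

results = {}
for alpha in (0.001, 0.01, 0.03, 0.1):
    _, results[alpha] = gradient_descent(x, y, alpha)
    print(f"alpha={alpha}: final cost={results[alpha]:.4g}")
```

With this data, the small alphas converge steadily (larger stable alphas converge faster), while alpha = 0.1 overshoots and the cost diverges.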
Scaling the data (normalization) won't help with only one feature (your theta[1]), as normalization only applies when you have two or more features (multivariate linear regression).
Also bear in mind that for a small number of features you can use the Normal Equation to get the correct answer.
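As a sketch of that closed-form alternative: the Normal Equation solves theta = (XᵀX)⁻¹Xᵀy directly, with no alpha to tune. The toy data below is illustrative:

```python
import numpy as np

# Same kind of toy data as before: y = 2x + 1 plus noise
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 0.1, 50)

X = np.column_stack([np.ones_like(x), x])  # prepend an intercept column

# Solve (X^T X) theta = X^T y; np.linalg.solve is preferred
# over forming an explicit matrix inverse for numerical stability
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # close to [1, 2]
```

This is exact for linear regression and cheap when the number of features is small; gradient descent only becomes preferable once XᵀX gets large or ill-conditioned.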