How to create a simple Gradient Descent algorithm

余生分开走 2021-02-06 09:14

I'm studying simple machine learning algorithms, beginning with a simple gradient descent, but I'm having trouble implementing it in Python.

Here is the exa

1 answer
  •  渐次进展
    2021-02-06 09:52

    The first issue is that running this with only one data point gives you an underdetermined system, which means it may have infinitely many solutions. With three variables you need at least 3 data points, and preferably many more.
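
    To illustrate the underdetermined case, here is a minimal sketch. The question's code is cut off, so the model form y = theta0 + theta1*x1 + theta2*x2 and the data point are assumptions made for illustration only.

```python
def predict(theta, x1, x2):
    # Assumed linear model with three parameters (not taken from the question).
    return theta[0] + theta[1] * x1 + theta[2] * x2


# A single hypothetical data point.
x1, x2, y = 2.0, 3.0, 10.0

# Three very different parameter vectors that all fit that point exactly.
candidates = [(10.0, 0.0, 0.0), (0.0, 5.0, 0.0), (1.0, 0.0, 3.0)]
residuals = [predict(t, x1, x2) - y for t in candidates]
print(residuals)
```

    Every residual is zero, so one data point cannot tell these parameter vectors apart.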

    Secondly, gradient descent where the step size is a scaled version of the gradient is not guaranteed to converge except in a small neighbourhood of the solution. You can fix that by switching to either a fixed-size step in the direction of the negative gradient (slow) or a line search in the direction of the negative gradient (faster, but slightly more complicated).
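
    The line-search option could look roughly like the sketch below. It uses a backtracking search with a sufficient-decrease (Armijo) test, which is one common choice rather than the only one. The function name `backtracking_step`, its parameters, and the quadratic test problem are all hypothetical.

```python
def backtracking_step(f, grad, theta, step=1.0, shrink=0.5, c=1e-4, max_halvings=50):
    """Take one step along -grad(theta), shrinking the step until f
    decreases by at least c * step * |grad|^2 (Armijo condition)."""
    g = grad(theta)
    g_sq = sum(gi * gi for gi in g)  # squared gradient norm
    f0 = f(theta)
    for _ in range(max_halvings):
        candidate = [t - step * gi for t, gi in zip(theta, g)]
        if f(candidate) <= f0 - c * step * g_sq:
            return candidate
        step *= shrink
    return theta  # no acceptable step found; leave theta unchanged


# Hypothetical test problem: a quadratic bowl with its minimum at (1, -2, 3).
def f(theta):
    return (theta[0] - 1) ** 2 + 10 * (theta[1] + 2) ** 2 + (theta[2] - 3) ** 2


def grad(theta):
    return [2 * (theta[0] - 1), 20 * (theta[1] + 2), 2 * (theta[2] - 3)]


theta = [0.0, 0.0, 0.0]
for _ in range(200):
    theta = backtracking_step(f, grad, theta)
print(theta)
```

    Each iteration restarts from the initial trial step and halves it until the error drops enough, so the step length adapts to the local curvature instead of being tuned by hand.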

    So for a fixed step size, instead of

    theta0 = theta0 - step * dEdtheta0
    theta1 = theta1 - step * dEdtheta1
    theta2 = theta2 - step * dEdtheta2
    

    You do this

    n = max(abs(dEdtheta0), abs(dEdtheta1), abs(dEdtheta2))  # largest gradient component by magnitude
    theta0 = theta0 - step * dEdtheta0 / n
    theta1 = theta1 - step * dEdtheta1 / n
    theta2 = theta2 - step * dEdtheta2 / n
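
    Put together, the normalised update might be used as in the sketch below. The linear model, the data, and the helper names `error` and `gradient` are assumptions, since the question's code is not shown. Halving `step` whenever a move fails to reduce the error is an addition of this sketch: a strictly fixed step can only get within roughly `step` of the minimum.

```python
# Hypothetical data for the assumed model y = theta0 + theta1*x1 + theta2*x2.
# Rows are (x1, x2, y), generated from theta = (1, 2, -1) with no noise.
DATA = [(0, 0, 1), (1, 0, 3), (0, 1, 0), (1, 1, 2), (2, 1, 4)]


def error(theta):
    """Half the sum of squared residuals."""
    return 0.5 * sum((theta[0] + theta[1] * x1 + theta[2] * x2 - y) ** 2
                     for x1, x2, y in DATA)


def gradient(theta):
    """Partial derivatives of error() with respect to theta0, theta1, theta2."""
    g = [0.0, 0.0, 0.0]
    for x1, x2, y in DATA:
        r = theta[0] + theta[1] * x1 + theta[2] * x2 - y
        g[0] += r
        g[1] += r * x1
        g[2] += r * x2
    return g


theta = [0.0, 0.0, 0.0]
step = 0.5
err = error(theta)
for _ in range(10000):
    g = gradient(theta)
    n = max(abs(gi) for gi in g)  # scale so the largest move equals `step`
    if n < 1e-12:                 # gradient is effectively zero
        break
    candidate = [t - step * gi / n for t, gi in zip(theta, g)]
    cand_err = error(candidate)
    if cand_err < err:
        theta, err = candidate, cand_err  # accept the step
    else:
        step *= 0.5                       # overshoot: shrink the step and retry
print(theta)
```

    Note the minus sign in the update: the parameters move against the gradient, which is the direction in which the error decreases.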
    

    It also looks like you may have a sign error in your steps.

    I'm also not sure that derror is a good stopping criterion. (But stopping criteria are notoriously hard to get "right".)

    My final point is that gradient descent is horribly slow for parameter fitting. You probably want to use conjugate-gradient or Levenberg-Marquardt methods instead. Both are available for Python in SciPy's scipy.optimize module (NumPy and SciPy aren't part of Python by default but are pretty easy to install).
