I understand what Gradient Descent does. Basically it tries to move towards the local optimal solution by slowly moving down the curve. I am trying to understand what is the act
Put simply, gradient descent you just take a small step towards where you think the zero is and then recalculate; Newton's method, you go all the way there.