Gradient descent has a problem with local minima: in the worst case we would need to restart it from exponentially many random starting points to find the global minimum, as in the sketch below.
Can anybody tell me about any alternatives to gradient descent?
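To make the restart idea concrete, here is a minimal sketch of random-restart gradient descent on a hypothetical 1-D loss with several local minima (the loss `f`, its gradient, and all hyperparameters are illustrative, not from any particular reference):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical 1-D loss with many local minima, plus its gradient.
f      = lambda x: np.sin(3 * x) + 0.1 * x**2
grad_f = lambda x: 3 * np.cos(3 * x) + 0.2 * x

def gradient_descent(x0, lr=0.01, steps=500):
    """Plain gradient descent from a single starting point."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

# Restart from many random points and keep the best result.
# Each run converges to whichever local minimum is nearby,
# so covering the space this way scales badly with dimension.
starts = rng.uniform(-5, 5, size=20)
best = min((gradient_descent(x0) for x0 in starts), key=f)
print(f"best x = {best:.3f}, f(x) = {f(best):.3f}")
```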
It has been demonstrated that getting stuck in a local minimum is very unlikely in a high-dimensional space: at a point where the gradient is zero, the loss would have to curve upward in every one of the many dimensions at once for it to be a local minimum, so almost all critical points are saddle points instead (source: Andrew Ng, Coursera Deep Learning Specialization). That also explains why gradient descent works so well in practice.
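You can see that intuition numerically with a toy model. Suppose the Hessian at a random critical point behaves like a random symmetric Gaussian matrix (an illustrative assumption on my part, not something from the course). Then the fraction of "Hessians" with all eigenvalues positive, i.e. critical points that are true local minima rather than saddles, collapses as the dimension grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def frac_minima(dim, trials=2000):
    """Fraction of random symmetric matrices whose eigenvalues are
    all positive -- critical points that would be local minima."""
    count = 0
    for _ in range(trials):
        a = rng.standard_normal((dim, dim))
        h = (a + a.T) / 2  # symmetrize to get a random "Hessian"
        if np.all(np.linalg.eigvalsh(h) > 0):
            count += 1
    return count / trials

for d in (1, 2, 4, 8):
    print(f"dim={d}: fraction of local minima ~ {frac_minima(d):.3f}")
```

In one dimension about half of the critical points are minima, but by dimension 8 essentially none are, which is the saddle-point argument in miniature.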