gradient-descent
http://ruder.io/optimizing-gradient-descent/
https://www.quora.com/Whats-the-difference-between-gradient-descent-and-stochastic-gradient-descent
https://en.wikipedia.org/wiki/Stochastic_gradient_descent
https://zh.coursera.org/learn/deep-neural-network/lecture/lBXu8/understanding-mini-batch-gradient-descent
https://zh.coursera.org/learn/deep-neural-network/lecture/qcogH/mini-batch-gradient-descent
https://am207.github.io/2017/wiki/gradientdescent.html
http://leon.bottou.org/publications/pdf/online-1998.pdf

References
Sutton, R. S. (1986). Two problems with backpropagation and other steepest