1、SGD (On the importance of initialization and momentum in deep learning)
3、Nesterov accelerated gradient
4、Adagrad (Adaptive subgradient methods for online learning and stochastic optimization)
5、RMSprop (Generating sequences with recurrent neural networks)
6、Rprop (resilient backpropagation algorithm)
7、Adadelta (Adadelta: an adaptive learning rate method)
8、Adam (Adam: a method for stochastic optimization)
9、AMSGrad (On the convergence of Adam and beyond)
10、AdaBound (Adaptive gradient methods with dynamic bound of learning rate)