Adaptive Moment Estimation (Adam)
Adam stands for Adaptive Moment Estimation, a first-order gradient-based optimization algorithm that is especially well suited to large-scale data and parameter optimization problems. It was proposed by Diederik P. Kingma and Jimmy Ba in 2014 and published at the ICLR 2015 conference in the paper "Adam: A Method for Stochastic Optimization".
The Adam algorithm is a first-order gradient-based optimization method for stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, computationally efficient, has low memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited to problems that are large in terms of data and/or parameters. It also works well for non-stationary objectives and for problems with very noisy and/or sparse gradients. The hyperparameters have intuitive interpretations and typically require little tuning.
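The "lower-order moments" in question are running estimates of the gradient's first moment (mean) and second moment (uncentered variance), each corrected for the bias introduced by initializing them at zero. Below is a minimal NumPy sketch of a single Adam step following Algorithm 1 of the paper; the function name adam_step is illustrative, while the default hyperparameters match the paper's suggested settings (step size α = 0.001, β₁ = 0.9, β₂ = 0.999, ε = 10⁻⁸).

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Algorithm 1 in Kingma & Ba, 2014).

    theta : current parameters
    grad  : gradient of the objective at theta
    m, v  : running first- and second-moment estimates
    t     : 1-based timestep
    """
    m = beta1 * m + (1 - beta1) * grad        # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(x) = x^2 starting from x = 5.
x, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 501):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.1)  # gradient of x^2 is 2x
print(x)  # close to 0
```

Note that the effective per-parameter step size is bounded by roughly lr, since the update divides the first-moment estimate by the square root of the second; this per-coordinate scaling is also what makes the method invariant to diagonal rescaling of the gradients mentioned above.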