Gradient descent is the workhorse of deep neural network training, but it suffers from slow convergence. A well-known way to overcome this is momentum, which effectively increases the learning rate of gradient descent along directions of consistent descent. Recently, many approaches have been proposed to control the momentum for better optimization towards the global minimum, such as Adam, diffGrad, and AdaBelief. Adam decreases the effective step size by dividing the first-moment (momentum) estimate by the square root of the second-moment estimate, an exponential moving average of squared gradients.
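For concreteness, a brief sketch of the standard Adam update in the usual notation (with gradient $g_t$, decay rates $\beta_1, \beta_2$, learning rate $\eta$, and small constant $\epsilon$; the symbols here are illustrative rather than taken from the text above):
\begin{align}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\, g_t, &
v_t &= \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2, \\
\hat{m}_t &= \frac{m_t}{1-\beta_1^t}, &
\hat{v}_t &= \frac{v_t}{1-\beta_2^t}, \\
\theta_t &= \theta_{t-1} - \eta\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}.
\end{align}
The division by $\sqrt{\hat{v}_t}$ is what shrinks the step for coordinates with large or noisy gradients, which is the mechanism referred to above.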