Combining Conjugate Direction Methods with Stochastic Approximation of Gradients
نویسندگان
چکیده
The method of conjugate directions provides a very effective way to optimize large, deterministic systems by gradient descent. In its standard form, however, it is not amenable to stochastic approximation of the gradient. Here we explore ideas from conjugate gradient in the stochastic (online) setting, using fast Hessian-gradient products to set up low-dimensional Krylov subspaces within individual mini-batches. In our benchmark experiments the resulting online learning algorithms converge orders of magnitude faster than ordinary stochastic gradient descent.
منابع مشابه
A new Levenberg-Marquardt approach based on Conjugate gradient structure for solving absolute value equations
In this paper, we present a new approach for solving absolute value equation (AVE) whichuse Levenberg-Marquardt method with conjugate subgradient structure. In conjugate subgradientmethods the new direction obtain by combining steepest descent direction and the previous di-rection which may not lead to good numerical results. Therefore, we replace the steepest descentdir...
متن کاملConjugate gradient neural network in prediction of clay behavior and parameters sensitivities
The use of artificial neural networks has increased in many areas of engineering. In particular, this method has been applied to many geotechnical engineering problems and demonstrated some degree of success. A review of the literature reveals that it has been used successfully in modeling soil behavior, site characterization, earth retaining structures, settlement of structures, slope stabilit...
متن کاملTowards Stochastic Conjugate Gradient Methods
The method of conjugate gradients provides a very effective way to optimize large, deterministic systems by gradient descent. In its standard form, however, it is not amenable to stochastic approximation of the gradient. Here we explore a number of ways to adopt ideas from conjugate gradient in the stochastic setting, using fast Hessian-vector products to obtain curvature information cheaply. I...
متن کاملConjugate Directions for Stochastic Gradient Descent
The method of conjugate gradients provides a very effective way to optimize large, deterministic systems by gradient descent. In its standard form, however, it is not amenable to stochastic approximation of the gradient. Here we explore ideas from conjugate gradient in the stochastic (online) setting, using fast Hessian-gradient products to set up low-dimensional Krylov subspaces within individ...
متن کاملAPPROXIMATION OF STOCHASTIC PARABOLIC DIFFERENTIAL EQUATIONS WITH TWO DIFFERENT FINITE DIFFERENCE SCHEMES
We focus on the use of two stable and accurate explicit finite difference schemes in order to approximate the solution of stochastic partial differential equations of It¨o type, in particular, parabolic equations. The main properties of these deterministic difference methods, i.e., convergence, consistency, and stability, are separately developed for the stochastic cases.
متن کامل