Katyusha X: Practical Momentum Method for Stochastic Sum-of-Nonconvex Optimization

Author

  • Zeyuan Allen-Zhu
Abstract

The problem of minimizing sum-of-nonconvex functions (i.e., convex functions that are averages of non-convex ones) is becoming increasingly important in machine learning, and is the core machinery for PCA, SVD, regularized Newton’s method, accelerated non-convex optimization, and more. We show how to provably obtain an accelerated stochastic algorithm for minimizing sum-of-nonconvex functions, by adding one additional line to the well-known SVRG method. This line corresponds to momentum, and shows how to directly apply momentum to the finite-sum stochastic minimization of sum-of-nonconvex functions. As a side result, our method enjoys linear parallel speed-up using mini-batches.
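As a rough illustration of what "SVRG plus one momentum line" can look like, here is a minimal Python sketch: a standard SVRG inner loop for minimizing (1/n) Σᵢ fᵢ(x), with a single heavy-ball-style momentum line appended after the variance-reduced gradient step. This is an assumption-laden sketch, not the exact Katyusha X update; all names (f_grads, step_size, momentum, epochs, inner_steps) are illustrative, and the paper specifies the precise momentum coefficient and placement.

```python
# Minimal sketch (an assumption, not the authors' exact Katyusha X update):
# a plain SVRG loop for min_x (1/n) * sum_i f_i(x), with one extra
# heavy-ball-style momentum line added after the variance-reduced step.
import numpy as np

def svrg_with_momentum(f_grads, x0, step_size=0.1, momentum=0.5,
                       epochs=20, inner_steps=None, seed=0):
    """f_grads: list of callables; f_grads[i](x) returns the gradient of f_i at x."""
    rng = np.random.default_rng(seed)
    n = len(f_grads)
    m = inner_steps or n                     # inner-loop (epoch) length
    x = np.array(x0, dtype=float)
    x_prev = x.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        # full gradient at the snapshot point (computed once per epoch)
        full_grad = sum(g(snapshot) for g in f_grads) / n
        for _ in range(m):
            i = rng.integers(n)
            # standard SVRG variance-reduced gradient estimator
            v = f_grads[i](x) - f_grads[i](snapshot) + full_grad
            x_next = x - step_size * v
            # the "one additional line": heavy-ball momentum on consecutive iterates
            x_next += momentum * (x - x_prev)
            x_prev, x = x, x_next
    return x
```

Setting momentum to zero recovers plain SVRG, which is the point of the construction: acceleration is obtained without restructuring the base method.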

Related Papers

Katyusha: The First Truly Accelerated Stochastic Gradient Method

We introduce Katyusha, the first direct, primal-only stochastic gradient method that has a provably accelerated convergence rate in convex optimization. In contrast, previous methods are based on dual coordinate descent, which is more restrictive, or on outer-inner loops, which make them “blind” to the underlying stochastic nature of the optimization process. Katyusha is the first algorith...

Fast Stochastic Variance Reduced Gradient Method with Momentum Acceleration for Machine Learning

Recently, research on accelerated stochastic gradient descent methods (e.g., SVRG) has made exciting progress (e.g., linear convergence for strongly convex problems). However, the best-known methods (e.g., Katyusha) require at least two auxiliary variables and two momentum parameters. In this paper, we propose a fast stochastic variance reduced gradient (FSVRG) method...

Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima

We propose stochastic optimization algorithms that can find local minima faster than existing algorithms for nonconvex optimization problems, by exploiting the third-order smoothness to escape non-degenerate saddle points more efficiently. More specifically, the proposed algorithm only needs Õ(ε^{-10/3}) stochastic gradient evaluations to converge to an approximate local minimum x, which satisfies...

Toward Deeper Understanding of Nonconvex Stochastic Optimization with Momentum using Diffusion Approximations

The Momentum Stochastic Gradient Descent (MSGD) algorithm has been widely applied to many nonconvex optimization problems in machine learning. Popular examples include training deep neural networks and dimensionality reduction. Due to the lack of convexity and the extra momentum term, the optimization theory of MSGD is still largely unknown. In this paper, we study this fundamental optimizati...

Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization

We analyze stochastic algorithms for optimizing nonconvex, nonsmooth finite-sum problems, where the nonsmooth part is convex. Surprisingly, unlike the smooth case, our knowledge of this fundamental problem is very limited. For example, it is not known whether the proximal stochastic gradient method with constant minibatch converges to a stationary point. To tackle this issue, we develop fast st...

Journal:
  • CoRR

Volume: abs/1802.03866

Pages: -

Year of publication: 2018