Learning rate selection in stochastic gradient methods based on line search strategies

Authors

Abstract

Finite-sum problems appear as the sample average approximation of a stochastic optimization problem and often arise in machine learning applications with large scale data sets. A very popular approach to face finite-sum problems is the stochastic gradient method. It is well known that a proper strategy to select the hyperparameters of this method (i.e. the set of a-priori selected parameters) and, in particular, the learning rate, is needed to guarantee both convergence properties and good practical performance. In this paper, we analyse standard and line search based updating rules to fix the learning rate sequence, also in relation to the size of the mini batch chosen to compute the current stochastic gradient. An extensive numerical experimentation is carried out in order to evaluate the effectiveness of the discussed strategies for both convex and non-convex test problems, highlighting that the line search based methods avoid an expensive initial setting of the hyperparameters. The approaches have also been applied to train a Convolutional Neural Network, providing promising results.
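The abstract leaves the updating rule implicit; the sketch below shows the general flavour of a line search based selection of the learning rate in a mini-batch stochastic gradient method. It is a minimal illustration, not the algorithm analysed in the paper: the toy least-squares problem, the Armijo-type sufficient-decrease test on the current mini-batch, and all parameter values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy finite-sum problem (an assumption for illustration): least squares,
# f(w) = (1/n) * sum_i (a_i^T w - b_i)^2
n, d = 1000, 20
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.01 * rng.normal(size=n)

def loss(w, idx):
    r = A[idx] @ w - b[idx]
    return np.mean(r ** 2)

def grad(w, idx):
    r = A[idx] @ w - b[idx]
    return 2.0 * (A[idx].T @ r) / len(idx)

def sgd_armijo(w, batch_size=32, lr0=1.0, c=0.1, shrink=0.5,
               n_iter=500, lr_min=1e-10):
    """Mini-batch SGD whose learning rate is fixed at each iteration by
    backtracking until an Armijo-type sufficient-decrease condition,
    checked on the current mini-batch, is satisfied."""
    for _ in range(n_iter):
        idx = rng.choice(n, size=batch_size, replace=False)
        g = grad(w, idx)
        f0 = loss(w, idx)
        lr = lr0  # reset the trial step at every iteration
        # Backtrack: accept lr once loss(w - lr*g) <= f0 - c*lr*||g||^2
        while loss(w - lr * g, idx) > f0 - c * lr * (g @ g) and lr > lr_min:
            lr *= shrink
        w = w - lr * g
    return w

w = sgd_armijo(np.zeros(d))
print("full-data loss after training:", loss(w, np.arange(n)))
```

Because the step size is chosen adaptively at every iteration, no learning-rate schedule has to be tuned in advance, which is the practical advantage the abstract points to.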


Similar articles

Learning Rate Adaptation in Stochastic Gradient Descent

The efficient supervised training of artificial neural networks is commonly viewed as the minimization of an error function that depends on the weights of the network. This perspective gives some advantage to the development of effective training algorithms, because the problem of minimizing a function is well known in the field of numerical analysis. Typically, deterministic minimization metho...


The Role of Vocabulary Learning Strategies on Vocabulary Retention and on Language Proficiency in Iranian EFL Students

In recent years, second language teaching has been searching for better ways of achieving the goals of teachers and students. On the teachers' side, this has led to research on linguistic, conversational and interactional structure. On the students' side, it has led to studies of learners' attitudes toward learning inside and outside the classroom, as well as of the various kinds of mental processing strategies. The aim of this research is to find methods tha...



Global Convergence of Conjugate Gradient Methods without Line Search

Global convergence results are derived for well-known conjugate gradient methods in which the line search step is replaced by a step whose length is determined by a formula. The results include the following cases: 1. The Fletcher-Reeves method, the Hestenes-Stiefel method, and the Dai-Yuan method applied to a strongly convex LC¹ objective function; 2. The Polak-Ribière method and the Conjugate ...
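To illustrate the idea of replacing the line search with a formula-determined step length, here is a minimal sketch, not the methods analysed in that paper: Fletcher-Reeves conjugate gradient applied to a strongly convex quadratic, where the exact minimizer along each search direction is available in closed form, so no backtracking is needed. The test problem and all names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Strongly convex quadratic test problem (an assumption for illustration):
# f(x) = 0.5 * x^T Q x - c^T x, with gradient g(x) = Q x - c
d = 30
M = rng.normal(size=(d, d))
Q = M.T @ M + np.eye(d)  # symmetric positive definite
c = rng.normal(size=d)

def cg_formula_step(x, n_iter=200, tol=1e-10):
    """Fletcher-Reeves conjugate gradient in which the step length comes
    from a closed-form formula (here the exact minimizer along the search
    direction, available for quadratics) instead of a line search."""
    g = Q @ x - c
    p = -g
    for _ in range(n_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha = -(g @ p) / (p @ Q @ p)    # formula-based step length
        x = x + alpha * p
        g_new = Q @ x - c
        beta = (g_new @ g_new) / (g @ g)  # Fletcher-Reeves coefficient
        p = -g_new + beta * p
        g = g_new
    return x

x = cg_formula_step(np.zeros(d))
print("gradient norm at solution:", np.linalg.norm(Q @ x - c))
```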


Nonlinear Conjugate Gradient Methods with Wolfe Type Line Search

A fragment of the convergence analysis survives from the full text (the left-hand side is truncated in this preview):

$$\cdots = \frac{\|d_{k-1}\|^{2}}{\|g_{k-1}\|^{4}} + \frac{1}{\|g_{k}\|^{2}} - \frac{\beta_{k}^{2}\,(g_{k}^{\top} d_{k-1})^{2}}{\|g_{k}\|^{4}}$$



Journal

Journal title: Applied Mathematics in Science and Engineering

Year: 2023

ISSN: 2769-0911

DOI: https://doi.org/10.1080/27690911.2022.2164000