On the Convergence of a Family of Robust Losses for Stochastic Gradient Descent
Abstract
The convergence of Stochastic Gradient Descent (SGD) with convex loss functions has been widely studied. However, vanilla SGD methods using convex losses cannot perform well with noisy labels, which adversely affect each update of the primal variable. Unfortunately, noisy labels are ubiquitous in real-world applications such as crowdsourcing. To handle noisy labels, in this paper we present a family of robust losses for SGD methods. By employing our robust losses, SGD methods successfully reduce the negative effects of noisy labels on each update of the primal variable. We not only show that the convergence rate of SGD methods using robust losses is O(1/T), but also provide a robustness analysis of two representative robust losses. Comprehensive experiments on six real-world datasets show that SGD methods using robust losses are clearly more robust than baseline methods in most situations, while converging quickly.
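The abstract does not spell out the loss family, so as one concrete illustration the sketch below runs SGD with a capped hinge loss min(max(0, 1 - y⟨w, x⟩), τ): once an example's hinge loss exceeds the cap τ, its subgradient vanishes, so a badly mislabeled point cannot keep dragging the primal variable w. The capped hinge, the threshold τ, and the Pegasos-style step size 1/(λt) are assumptions for illustration, not necessarily the paper's exact construction.

```python
import numpy as np

def capped_hinge_subgrad(w, x, y, tau=2.0):
    """Subgradient of min(max(0, 1 - y<w, x>), tau) w.r.t. w.

    Illustrative robust loss (an assumption, not the paper's family):
    examples whose hinge loss exceeds tau -- likely label noise --
    contribute a zero subgradient, so they cannot drag w away
    from the clean decision boundary.
    """
    margin = 1.0 - y * np.dot(w, x)
    if 0.0 < margin < tau:       # active, trusted region
        return -y * x
    return np.zeros_like(w)      # correct side, or clipped as an outlier

def sgd_robust(X, Y, tau=2.0, lam=0.01, T=10_000, seed=0):
    """Plain SGD with step size eta_t = 1/(lam * t) on the
    L2-regularized capped-hinge objective."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for t in range(1, T + 1):
        i = rng.integers(len(Y))
        eta = 1.0 / (lam * t)
        w -= eta * (lam * w + capped_hinge_subgrad(w, X[i], Y[i], tau))
    return w
```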
Similar articles
Identification of Multiple Input-multiple Output Non-linear System Cement Rotary Kiln using Stochastic Gradient-based Rough-neural Network
Because of the interactions among the variables of a multiple-input multiple-output (MIMO) nonlinear system, its identification is a difficult task, particularly in the presence of uncertainties. The cement rotary kiln (CRK) is a MIMO nonlinear system in a cement factory with a complicated mechanism and uncertain disturbances. The identification of the CRK is very important for different pur...
Averaging Stochastic Gradient Descent on Riemannian Manifolds
We consider the minimization of a function defined on a Riemannian manifold M accessible only through unbiased estimates of its gradients. We develop a geometric framework to transform a sequence of slowly converging iterates generated from stochastic gradient descent (SGD) on M to an averaged iterate sequence with a robust and fast O(1/n) convergence rate. We then present an application of our...
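For intuition, here is a minimal Euclidean sketch of the iterate-averaging idea (Polyak-Ruppert averaging) behind that paper; the Riemannian version replaces the subtraction and the running mean with exponential maps and geodesic averaging, which this sketch omits. The function names and step size are illustrative assumptions.

```python
import numpy as np

def sgd_with_averaging(noisy_grad, w0, eta=0.05, T=10_000, seed=0):
    """SGD plus a running (Polyak-Ruppert) average of the iterates.

    Euclidean sketch only: the last iterate of constant-step SGD
    keeps oscillating, while the averaged iterate w_bar converges
    at the faster O(1/n) rate for suitable strongly convex objectives.
    """
    rng = np.random.default_rng(seed)
    w = w0.copy()
    w_bar = w0.copy()
    for t in range(1, T + 1):
        w -= eta * noisy_grad(w, rng)
        w_bar += (w - w_bar) / t   # incremental mean of the iterates
    return w_bar

# Example: noisy gradients of f(w) = 0.5 * ||w||^2 (minimizer is 0)
if __name__ == "__main__":
    g = lambda w, rng: w + 0.1 * rng.standard_normal(w.shape)
    print(sgd_with_averaging(g, np.ones(5)))
```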
Wavelet Based Estimation of the Derivatives of a Density for a Discrete-Time Stochastic Process: Lp-Losses
We propose a wavelet-based method for estimating the derivatives of a probability density for a sequence of random variables with a common one-dimensional probability density function, and obtain an upper bound on the Lp-losses of such estimators. We suppose that the process is strongly mixing and show that the rate of convergence essentially depends on the behavior of a special quad...
Minimizing Calibrated Loss using Stochastic Low-Rank Newton Descent for large scale image classification
A standard approach for large-scale image classification involves high-dimensional features and the Stochastic Gradient Descent (SGD) algorithm for minimizing the classical hinge loss in the primal space. Although the complexity of Stochastic Gradient Descent is linear in the number of samples, this method suffers from slow convergence. In order to cope with this issue, we propose here a Stochas...
Stochastic Coordinate Descent Methods for Regularized Smooth and Nonsmooth Losses
Stochastic Coordinate Descent (SCD) methods are among the first optimization schemes suggested for efficiently solving large-scale problems. However, until now there has been a gap between the convergence rate analysis and practical SCD algorithms for general smooth losses, and there is no primal SCD algorithm for nonsmooth losses. In this paper, we discuss these issues using the recently develop...
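As a rough illustration of the coordinate-descent template discussed in that abstract (not its proposed algorithm), the following sketch runs stochastic coordinate descent on a smooth ridge-regression loss with per-coordinate step sizes; the problem, names, and constants are all assumptions for illustration.

```python
import numpy as np

def scd_ridge(X, y, lam=0.1, T=5_000, seed=0):
    """Stochastic coordinate descent for ridge regression:
    f(w) = ||Xw - y||^2 / (2n) + lam * ||w||^2 / 2.

    Each step updates one random coordinate with its own
    Lipschitz step size, keeping the residual r = Xw - y
    up to date in O(n) instead of recomputing it.
    """
    n, d = X.shape
    rng = np.random.default_rng(seed)
    L = (X ** 2).sum(axis=0) / n + lam    # per-coordinate Lipschitz constants
    w = np.zeros(d)
    r = -y.astype(float)                  # residual Xw - y at w = 0
    for _ in range(T):
        j = rng.integers(d)
        g = X[:, j] @ r / n + lam * w[j]  # partial derivative along coordinate j
        step = g / L[j]
        w[j] -= step
        r -= step * X[:, j]               # maintain r = Xw - y incrementally
    return w
```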