Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient

Authors

Tianbao Yang

Lijun Zhang

Rong Jin

Jinfeng Yi

Abstract

This work focuses on the dynamic regret of online convex optimization, which compares the performance of online learning to a clairvoyant who knows the sequence of loss functions in advance and hence selects the minimizer of the loss function at each step. By assuming that the clairvoyant moves slowly (i.e., the minimizers change slowly), we present several improved variation-based upper bounds of the dynamic regret under true and noisy gradient feedback, which are optimal in light of the presented lower bounds. The key to our analysis is to explore a regularity metric that measures the temporal changes in the clairvoyant's minimizers, which we refer to as path variation. First, we present a general lower bound in terms of the path variation, and then show that under full information or gradient feedback we are able to achieve an optimal dynamic regret. Second, we present a lower bound with noisy gradient feedback and then show that we can achieve optimal dynamic regrets under stochastic gradient feedback and two-point bandit feedback. Moreover, for a sequence of smooth loss functions that admit a small variation in the gradients, our dynamic regret under two-point bandit feedback matches what is achieved with full information.

Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016. JMLR: W&CP volume 48. Copyright 2016 by the author(s).
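As a rough illustration of the two quantities named in the abstract, the sketch below runs plain online gradient descent on one-dimensional quadratic losses whose minimizers drift, and reports the dynamic regret alongside the path variation. The loss family `f_t(w) = 0.5 * (w - c_t)^2`, the step size, and the name `run_ogd` are assumptions made for this example only; this is not the paper's algorithm, and it does not reproduce any of its bounds.

```python
def run_ogd(minimizers, step_size=0.5, w0=0.0):
    """Online gradient descent on f_t(w) = 0.5 * (w - c_t)^2 (hypothetical losses).

    `minimizers` is the clairvoyant's sequence c_1, ..., c_T.
    Returns (dynamic_regret, path_variation).
    """
    w = w0
    regret = 0.0
    for c in minimizers:
        # The clairvoyant plays c and incurs zero loss, so the learner's
        # loss at this step is exactly its contribution to dynamic regret.
        regret += 0.5 * (w - c) ** 2
        grad = w - c                  # true gradient of f_t at w
        w = w - step_size * grad      # gradient step toward the current minimizer
    # Path variation: total distance travelled by the minimizers.
    path_variation = sum(abs(b - a) for a, b in zip(minimizers, minimizers[1:]))
    return regret, path_variation


T = 200
static_seq = [1.0] * T                    # clairvoyant does not move
slow_seq = [0.01 * t for t in range(T)]   # clairvoyant drifts slowly

static_regret, static_pv = run_ogd(static_seq)
slow_regret, slow_pv = run_ogd(slow_seq)
print("static:", static_regret, static_pv)
print("slow:  ", slow_regret, slow_pv)
```

In this toy setting the learner lags the minimizer by a bounded amount, so the dynamic regret stays small when the path variation is small, which is the qualitative behavior the abstract describes.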

Similar resources

Perishability of Data: Dynamic Pricing under Varying-Coefficient Models

We consider a firm that sells a large number of products to its customers in an online fashion. Each product is described by a high dimensional feature vector, and the market value of a product is assumed to be linear in the values of its features. Parameters of the valuation model are unknown and can change over time. The firm sequentially observes a product’s features and can use the historic...

Learning rotations with little regret

We describe online algorithms for learning a rotation from pairs of unit vectors in R^n. We show that the expected regret of our online algorithm compared to the best fixed rotation chosen offline over T iterations is O(√(nT)). We also give a lower bound that proves that this expected regret bound is optimal within a constant factor. This resolves an open problem posed in COLT 2008. Our online a...

Dynamical Models and tracking regret in online convex programming

This paper describes a new online convex optimization method which incorporates a family of candidate dynamical models and establishes novel tracking regret bounds that scale with the comparator’s deviation from the best dynamical model in this family. Previous online optimization methods are designed to have a total accumulated loss comparable to that of the best comparator sequence, and exist...

Dynamic Pricing in High-dimensions

We study the pricing problem faced by a firm that sells a large number of products, described via a wide range of features, to customers that arrive over time. This is motivated in part by the prevalence of online marketplaces that allow for real-time pricing. We propose a dynamic policy, called Regularized Maximum Likelihood Pricing (RMLP), that obtains asymptotically optimal revenue. Our poli...

Accelerated Gradient Methods for Stochastic Optimization and Online Learning

Regularized risk minimization often involves non-smooth optimization, either because of the loss function (e.g., hinge loss) or the regularizer (e.g., l1-regularizer). Gradient methods, though highly scalable and easy to implement, are known to converge slowly. In this paper, we develop a novel accelerated gradient method for stochastic optimization while still preserving their computational si...

Publication date: 2016
