Bandit Convex Optimization: Towards Tight Bounds

Authors

Elad Hazan

Kfir Y. Levy

Abstract

Bandit Convex Optimization (BCO) is a fundamental framework for decision making under uncertainty, which generalizes many problems from the realm of online and statistical learning. While the special case of linear cost functions is well understood, the best attainable regret for BCO with nonlinear losses remains an important open question. In this paper we take a step towards understanding the best attainable regret bounds for BCO: we give an efficient algorithm with near-optimal regret for BCO with strongly-convex and smooth loss functions. In contrast to previous works on BCO that use time-invariant exploration schemes, our method employs an exploration scheme that shrinks with time.
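To make the contrast between time-invariant and shrinking exploration concrete, here is a minimal sketch. It is not the algorithm from the paper. It uses the classic one-point spherical gradient estimator on the unit ball, and the schedule `delta_t = min(0.5, t^(-1/4))`, the step size `1/(sigma * t)`, and all function names are assumptions chosen only for illustration.

```python
import math
import random


def sample_unit_sphere(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project_ball(x, radius):
    """Euclidean projection onto the origin-centered ball of the given radius."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= radius:
        return list(x)
    return [c * radius / norm for c in x]


def bandit_gd_shrinking(loss, dim, horizon, sigma, seed=0):
    """One-point bandit gradient descent with a shrinking exploration radius.

    loss(t, y) returns only the scalar loss at the played point y (bandit
    feedback). sigma is an assumed strong-convexity parameter.
    Returns the final iterate and a log of (delta_t, norm of played point).
    """
    rng = random.Random(seed)
    x = [0.0] * dim
    log = []
    for t in range(1, horizon + 1):
        # Hypothetical schedule: the exploration radius never grows with t.
        delta = min(0.5, t ** -0.25)
        u = sample_unit_sphere(dim, rng)
        y = [xi + delta * ui for xi, ui in zip(x, u)]  # point actually played
        value = loss(t, y)                             # single scalar observed
        # One-point estimate of the gradient of a smoothed version of the loss.
        g = [(dim / delta) * value * ui for ui in u]
        eta = 1.0 / (sigma * t)                        # assumed step size
        # Stay in the shrunken ball so the next perturbed point is feasible.
        x = project_ball([xi - eta * gi for xi, gi in zip(x, g)], 1.0 - delta)
        log.append((delta, math.sqrt(sum(c * c for c in y))))
    return x, log


def quadratic_loss(t, y):
    """Example strongly convex and smooth loss (an illustrative choice)."""
    return sum((c - 0.3) ** 2 for c in y)


final_x, history = bandit_gd_shrinking(quadratic_loss, dim=3, horizon=200, sigma=2.0)
```

A time-invariant scheme would fix `delta` once for the whole horizon. Here the radius decreases, so later rounds play points closer to the current iterate, and the shrunken projection radius `1 - delta` keeps every played point inside the unit ball.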

Similar resources

Nearly Tight Bounds for the Continuum-Armed Bandit Problem

In the multi-armed bandit problem, an online algorithm must choose from a set of strategies in a sequence of n trials so as to minimize the total cost of the chosen strategies. While nearly tight upper and lower bounds are known in the case when the strategy set is finite, much less is known when there is an infinite strategy set. Here we consider the case when the set of strategies is a subset...

An optimal algorithm for bandit convex optimization

We consider the problem of online convex optimization against an arbitrary adversary with bandit feedback, known as bandit convex optimization. We give the first Õ(√T)-regret algorithm for this setting based on a novel application of the ellipsoid method to online learning. This bound is known to be tight up to logarithmic factors. Our analysis introduces new tools in discrete convex geometry.

Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff

Bandit convex optimization is one of the fundamental problems in the field of online learning. The best algorithm for the general bandit convex optimization problem guarantees a regret of Õ(T^{5/6}), while the best known lower bound is Ω(T^{1/2}). Many attempts have been made to bridge the huge gap between these bounds. A particularly interesting special case of this problem assumes that the loss...

Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback

Bandit convex optimization is a special case of online convex optimization with partial information. In this setting, a player attempts to minimize a sequence of adversarially generated convex loss functions, while only observing the value of each function at a single point. In some cases, the minimax regret of these problems is known to be strictly worse than the minimax regret in the correspo...

On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization

The problem of stochastic convex optimization with bandit feedback (in the learning community) or without knowledge of gradients (in the optimization community) has received much attention in recent years, in the form of algorithms and performance upper bounds. However, much less is known about the inherent complexity of these problems, and there are few lower bounds in the literature, especial...

Publication date: 2014
