An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback
Abstract
We consider the closely related problems of bandit convex optimization with two-point feedback, and zero-order stochastic convex optimization with two function evaluations per round. We provide a simple algorithm and analysis which is optimal for convex Lipschitz functions. This improves on Duchi et al. (2015), which only provides an optimal result for smooth functions. Moreover, the algorithm and analysis are simpler, and readily extend to non-Euclidean problems. The algorithm is based on a small but surprisingly powerful modification of the gradient estimator.
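The abstract does not spell out the estimator, so the following is only a minimal sketch of a generic two-point gradient estimator of the kind common in the zero-order literature, not necessarily the exact construction in the paper. It assumes the Euclidean setting, a random direction drawn uniformly from the unit sphere, and a symmetric difference of two function evaluations. The function names (`sample_unit_sphere`, `two_point_gradient`, `zero_order_descent`), the step-size schedule, and the iterate averaging are all illustrative assumptions.

```python
import math
import random


def sample_unit_sphere(dim, rng):
    """Draw a direction uniformly from the Euclidean unit sphere."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:  # guard against the (measure-zero) all-zero draw
            return [c / norm for c in v]


def two_point_gradient(f, x, delta, rng):
    """Estimate a gradient of f at x from exactly two evaluations of f.

    Assumed form: (dim / (2 * delta)) * (f(x + delta*u) - f(x - delta*u)) * u,
    with u uniform on the unit sphere.
    """
    dim = len(x)
    u = sample_unit_sphere(dim, rng)
    f_plus = f([xi + delta * ui for xi, ui in zip(x, u)])
    f_minus = f([xi - delta * ui for xi, ui in zip(x, u)])
    scale = dim * (f_plus - f_minus) / (2.0 * delta)
    return [scale * ui for ui in u]


def zero_order_descent(f, x0, steps, step_size, delta, seed=0):
    """Unconstrained gradient descent driven by the two-point estimator.

    Uses a step size of step_size / sqrt(t) and returns the running
    average of the iterates (both are assumptions for this sketch).
    """
    rng = random.Random(seed)
    x = list(x0)
    avg = [0.0] * len(x)
    for t in range(1, steps + 1):
        g = two_point_gradient(f, x, delta, rng)
        eta = step_size / math.sqrt(t)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        avg = [a + (xi - a) / t for a, xi in zip(avg, x)]
    return avg


if __name__ == "__main__":
    # A convex, Lipschitz, non-smooth test function (hypothetical example).
    def f(x):
        return abs(x[0] - 1.0) + abs(x[1] + 2.0)

    x_hat = zero_order_descent(f, [0.0, 0.0], steps=4000, step_size=0.5, delta=1e-3)
    print(x_hat, f(x_hat))
```

For a linear function the estimate reduces to `dim * (a·u) * u`, whose expectation over `u` is exactly the gradient `a`, which is one way to sanity-check an implementation like this.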
Related articles
Stochastic convex optimization with bandit feedback
This paper addresses the problem of minimizing a convex, Lipschitz function f over a convex, compact set X under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value f(x) at any query point x ∈ X. The quantity of interest is the regret of the algorithm, which is the sum of the function values at the algorithm’s query points...
Regret Analysis for Continuous Dueling Bandit
The dueling bandit is a learning framework wherein the feedback information in the learning process is restricted to a noisy comparison between a pair of actions. In this research, we address a dueling bandit problem based on a cost function over a continuous space. We propose a stochastic mirror descent algorithm and show that the algorithm achieves an O(√(T log T))-regret bound under strong ...
An optimal algorithm for bandit convex optimization
We consider the problem of online convex optimization against an arbitrary adversary with bandit feedback, known as bandit convex optimization. We give the first Õ(√T)-regret algorithm for this setting based on a novel application of the ellipsoid method to online learning. This bound is known to be tight up to logarithmic factors. Our analysis introduces new tools in discrete convex geometry.
Particle Swarm Optimization with Smart Inertia Factor for Combined Heat and Power Economic Dispatch
In this paper particle swarm optimization with smart inertia factor (PSO-SIF) algorithm is proposed to solve combined heat and power economic dispatch (CHPED) problem. The CHPED problem is one of the most important problems in power systems and is a challenging non-convex and non-linear optimization problem. The aim of solving CHPED problem is to determine optimal heat and power of generating u...
Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback
Bandit convex optimization is a special case of online convex optimization with partial information. In this setting, a player attempts to minimize a sequence of adversarially generated convex loss functions, while only observing the value of each function at a single point. In some cases, the minimax regret of these problems is known to be strictly worse than the minimax regret in the correspo...
Journal title:
- Journal of Machine Learning Research
Volume 18
Publication date: 2017