Linear Contextual Bandits with Global Constraints and Objective

Authors

Shipra Agrawal

Nikhil R. Devanur

Abstract

We consider the linear contextual bandit problem with global convex constraints and a concave objective function. In each round, the outcome of pulling an arm is a vector that depends linearly on the context of that arm. The global constraints require the average of these vectors to lie in a certain convex set. The objective is a concave function of this average vector. This problem turns out to be a common generalization of classic linear contextual bandits (linContextual) [8, 17, 1], bandits with concave rewards and convex knapsacks (BwCR) [4], and the online stochastic convex programming (OSCP) problem [5]. We present algorithms with near-optimal regret bounds for this problem. Our bounds compare favorably to results on the unstructured version of the problem [6, 12], where the relation between the contexts and the outcomes could be arbitrary, but the algorithm only competes against a fixed set of policies. We combine techniques from the work on linContextual, BwCR, and OSCP in a nontrivial manner while also tackling new difficulties that are not present in any of these special cases.

Author affiliation: Microsoft Research.
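The setting in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the weight matrix `W`, the choice of objective (first coordinate of the average vector), the constraint set (second coordinate at most a budget), the noise model, and the uniform-random placeholder policy. It shows only the problem structure (linear outcomes, an average vector, a concave objective, a convex constraint set), not the authors' algorithm.

```python
import random

def expected_outcome(W, x):
    """Return W^T x: the expected d-dimensional outcome for an m-dimensional context x."""
    m, d = len(W), len(W[0])
    return [sum(W[i][j] * x[i] for i in range(m)) for j in range(d)]

def average_vector(outcomes):
    """Coordinate-wise average of the outcome vectors observed so far."""
    n = len(outcomes)
    return [sum(v[j] for v in outcomes) / n for j in range(len(outcomes[0]))]

def objective(avg):
    """Hypothetical concave objective: the first coordinate (a linear reward)."""
    return avg[0]

def in_constraint_set(avg, budget=0.5):
    """Hypothetical convex set: the second coordinate (consumption) stays within budget."""
    return avg[1] <= budget

def simulate(T=200, num_arms=3, seed=0):
    rng = random.Random(seed)
    # Hypothetical weight matrix (m = 2 context features, d = 2 outcome coordinates).
    # In the bandit problem this matrix is unknown to the learner.
    W = [[0.6, 0.2], [0.1, 0.5]]
    outcomes = []
    for _ in range(T):
        # One context vector per arm, revealed at the start of the round.
        contexts = [[rng.random(), rng.random()] for _ in range(num_arms)]
        # Placeholder policy: pick an arm uniformly at random (no learning).
        arm = rng.randrange(num_arms)
        mean = expected_outcome(W, contexts[arm])
        # Observed outcome: the linear mean plus small bounded noise.
        outcomes.append([mu + rng.uniform(-0.05, 0.05) for mu in mean])
    avg = average_vector(outcomes)
    return avg, objective(avg), in_constraint_set(avg)

if __name__ == "__main__":
    avg, value, feasible = simulate()
    print("average outcome:", avg)
    print("objective value:", value)
    print("inside constraint set:", feasible)
```

A learning algorithm would replace the random arm choice with one that estimates the weight matrix from past outcomes and trades off the objective against the constraint; that part is deliberately left out of this sketch.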

Related resources

Linear Contextual Bandits with Knapsacks

We consider the linear contextual bandit problem with resource consumption, in addition to reward generation. In each round, the outcome of pulling an arm is a reward as well as a vector of resource consumptions. The expected values of these outcomes depend linearly on the context of that arm. The budget/capacity constraints require that the total consumption doesn’t exceed the budget for each ...

Resourceful Contextual Bandits

We study contextual bandits with ancillary constraints on resources, which are common in real-world applications such as choosing ads or dynamic pricing of items. We design the first algorithm for solving these problems that improves over a trivial reduction to the non-contextual case. We consider very general settings for both contextual bandits (arbitrary policy sets, Dudik et al. (2011)) and ...

Contextual Bandits with Global Constraints and Objective

We consider the contextual version of a multi-armed bandit problem with global convex constraints and a concave objective function. In each round, the outcome of pulling an arm is a context-dependent vector, and the global constraints require the average of these vectors to lie in a certain convex set. The objective is a concave function of this average vector. The learning agent competes with an...

A Survey on Contextual Multi-armed Bandits

4 Stochastic Contextual Bandits
4.1 Stochastic Contextual Bandits with Linear Realizability Assumption
4.1.1 LinUCB/SupLinUCB
4.1.2 LinREL/SupLinREL
4.1.3 CofineUCB
4.1.4 Thompson Sampling with Linear Payoffs...

Provably Optimal Algorithms for Generalized Linear Contextual Bandits

Contextual bandits are widely used in Internet services, from news recommendation to advertising and Web search. Generalized linear models (logistic regression in particular) have demonstrated stronger performance than linear models in many applications where rewards are binary. However, most theoretical analyses of contextual bandits so far are on linear bandits. In this work, we propose ...

Journal: CoRR

Volume: abs/1507.06738

Publication date: 2015
