Global Bandits with Hölder Continuity

Authors

  • Onur Atan
  • Cem Tekin
  • Mihaela van der Schaar
Abstract

Standard Multi-Armed Bandit (MAB) problems assume that the arms are independent. However, in many application scenarios, playing an arm also provides information about the remaining arms. In such applications, this informativeness can and should be exploited to enable faster convergence to the optimal solution. In this paper, we introduce and formalize the Global MAB (GMAB), in which arms are globally informative through a global parameter, i.e., choosing an arm reveals information about all the arms. We propose a greedy policy for the GMAB which always selects the arm with the highest estimated expected reward, and prove that it achieves bounded parameter-dependent regret. Hence, this policy selects suboptimal arms only finitely many times, and after a finite number of initial time steps, the optimal arm is selected in all of the remaining time steps with probability one. In addition, we study how the informativeness of the arms about each other’s rewards affects the speed of learning. Specifically, we prove that the parameter-free (worst-case) regret is sublinear in time, and decreases with the informativeness of the arms. We also prove a sublinear-in-time Bayesian risk bound for the GMAB, which reduces to the well-known Bayesian risk bound for linearly parameterized bandits when the arms are fully informative. GMABs have applications ranging from drug and treatment discovery to dynamic pricing. Preliminary work. Under review by AISTATS 2014. Do not distribute.
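The greedy selection rule described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' exact policy: the two reward functions, their inverses, the Bernoulli reward model, the play-count-weighted estimator of the global parameter, and the names `ARMS`, `estimate_theta`, `greedy_arm`, and `run` are all hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical arms: (expected-reward function of theta, its inverse), theta in [0, 1].
# These specific functions are assumptions, not taken from the abstract.
ARMS = [
    (lambda t: 1.0 - math.sqrt(t), lambda r: (1.0 - r) ** 2),
    (lambda t: 0.8 * t, lambda r: min(r / 0.8, 1.0)),
]


def estimate_theta(counts, means):
    """Combine per-arm estimates of theta, weighted by how often each arm was played."""
    total = sum(counts)
    if total == 0:
        return 0.5  # arbitrary starting guess before any data (assumption)
    return sum(n * inv(m) for (_, inv), n, m in zip(ARMS, counts, means)) / total


def greedy_arm(theta_hat):
    """Pick the arm whose expected reward is highest at the current estimate."""
    return max(range(len(ARMS)), key=lambda k: ARMS[k][0](theta_hat))


def run(theta, horizon, seed=0):
    """Simulate the greedy rule with Bernoulli rewards; return the arms chosen."""
    rng = random.Random(seed)
    counts = [0] * len(ARMS)
    means = [0.0] * len(ARMS)
    choices = []
    for _ in range(horizon):
        k = greedy_arm(estimate_theta(counts, means))
        reward = 1.0 if rng.random() < ARMS[k][0](theta) else 0.0
        counts[k] += 1
        means[k] += (reward - means[k]) / counts[k]  # running sample mean
        choices.append(k)
    return choices
```

Because every observed reward updates the shared estimate of theta, a play of one arm refines the estimated reward of every other arm, which is the informativeness the abstract refers to. Calling `run(0.9, 200)` returns the sequence of chosen arm indices for one simulated run.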

Related resources

Hölder continuity of solution maps to a parametric weak vector equilibrium problem

In this paper, by using a new concept of strong convexity, we obtain sufficient conditions for Hölder continuity of the solution mapping for a parametric weak vector equilibrium problem in the case where the solution mapping is a general set-valued one. Without strong monotonicity assumptions, the Hölder continuity of solution maps to parametric weak vector optimization problems is discussed.

Global Multi-armed Bandits with Hölder Continuity

Standard Multi-Armed Bandit (MAB) problems assume that the arms are independent. However, in many application scenarios, the information obtained by playing an arm provides information about the remainder of the arms. Hence, in such applications, this informativeness can and should be exploited to enable faster convergence to the optimal solution. In this paper, we formalize a new class of multi-a...

Asymptotic optimal control of multi-class restless bandits

We study the asymptotic optimal control of multi-class restless bandits. A restless bandit is a controllable process whose state evolution depends on whether or not the bandit is made active. The aim is to find a control that determines at each decision epoch which bandits to make active in order to minimize the overall average cost associated to the states the bandits are in. Sinc...

Asymptotically optimal priority policies for indexable and non-indexable restless bandits

We study the asymptotic optimal control of multi-class restless bandits. A restless bandit is a controllable stochastic process whose state evolution depends on whether or not the bandit is made active. Since finding the optimal control is typically intractable, we propose a class of priority policies that are proved to be asymptotically optimal under a global attractor property an...

Log Hölder Continuity of the Integrated Density of States for Stochastic Jacobi Matrices

We consider the integrated density of states, $k(E)$, of a general operator on $\ell^2(\mathbb{Z}^\nu)$ of the form $h = h_0 + v$, where $(h_0 u)(n) = \sum_{|i|=1} u(n+i)$ and $(vu)(n) = v(n)u(n)$, where $v$ is a general bounded ergodic stationary process on $\mathbb{Z}^\nu$. We show that $|k(E) - k(E')| \le C[-\log(|E - E'|)]^{-1}$ when $|E - E'| \le \tfrac{1}{2}$. The key is a "Thouless formula for the strip."

Journal title:
  • CoRR

Volume: abs/1410.7890

Publication date: 2014