Nonparametric Bayesian multiarmed bandits for single-cell experiment design
Authors
Abstract
Related papers
PAC-Bayesian Analysis of Martingales and Multiarmed Bandits
We present two alternative ways to apply PAC-Bayesian analysis to sequences of dependent random variables. The first is based on a new lemma that makes it possible to bound expectations of convex functions of certain dependent random variables by expectations of the same functions of independent Bernoulli random variables. This lemma provides an alternative tool to the Hoeffding-Azuma inequality to bound con...
Multiarmed Bandits With Limited Expert Advice
We consider the problem of minimizing regret in the setting of advice-efficient multiarmed bandits with expert advice. We give an algorithm for the setting of K arms and N experts, out of which we are allowed to query and use only M experts' advice in each round, which has a regret bound of Õ(√(min{K, M} · N · T / M)) after T rounds. We also prove that any algorithm for this problem must have expected ...
Multiarmed Bandits in the Worst Case
We present a survey of results on a recently formulated variant of the classical (stochastic) multiarmed bandit problem in which no assumption is made on the mechanism generating the rewards. We describe randomized allocation policies for this variant and prove bounds on their regret as a function of the time horizon and the number of arms. These bounds hold for any assignment of rewards to the...
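The survey's policies are not reproduced here, but as a hedged illustration of a randomized allocation policy for the adversarial (worst-case) setting it describes, the sketch below implements the standard Exp3 strategy (exponential weights with uniform exploration). The arm count, horizon, reward function, and exploration rate gamma are illustrative placeholders, not values taken from the paper.

```python
import math
import random

def exp3(n_arms, horizon, reward_fn, gamma=0.1):
    """Minimal Exp3 sketch: exponential weights with uniform exploration.

    reward_fn(t, arm) should return a reward in [0, 1]; it stands in for an
    arbitrary (possibly adversarial) assignment of rewards to arms.
    """
    weights = [1.0] * n_arms
    total_reward = 0.0
    for t in range(horizon):
        w_sum = sum(weights)
        # Mix the normalized weights with uniform exploration.
        probs = [(1 - gamma) * w / w_sum + gamma / n_arms for w in weights]
        arm = random.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        total_reward += reward
        # Importance-weighted estimate of the chosen arm's reward.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / n_arms)
    return total_reward

# Toy usage: 5 arms with fixed Bernoulli means as a stand-in reward sequence.
means = [0.2, 0.5, 0.3, 0.7, 0.4]
print(exp3(5, 1000, lambda t, a: float(random.random() < means[a])))
```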
Multitasking, Multiarmed Bandits, and the Italian Judiciary
We model how a judge schedules cases as a multi-armed bandit problem. The model indicates that a first-in-first-out (FIFO) scheduling policy is optimal when the case completion hazard rate function is monotonic. But there are two ways to implement FIFO in this context: at the hearing level or at the case level. Our model indicates that the former policy, prioritizing the oldest hearing, is opti...
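The paper's model is not reproduced here; the toy simulation below only illustrates the distinction the abstract draws between the two FIFO implementations: case-level FIFO (serve the oldest open case until it finishes) versus hearing-level FIFO (always hold the hearing that has been pending longest). The hearing counts and the one-hearing-per-period assumption are illustrative only.

```python
import random

def simulate(policy, hearings_needed):
    """One hearing per period; returns the average case completion time.

    policy is "case" (finish the oldest open case first) or "hearing"
    (serve the case whose pending hearing has waited longest).
    """
    n = len(hearings_needed)
    remaining = list(hearings_needed)
    # Arrival order: earlier cases start with older (smaller) pending hearings.
    last_heard = list(range(-n, 0))
    completion = [0] * n
    t = 0
    while any(r > 0 for r in remaining):
        open_cases = [i for i in range(n) if remaining[i] > 0]
        if policy == "case":
            chosen = min(open_cases)                               # oldest case
        else:
            chosen = min(open_cases, key=lambda j: last_heard[j])  # oldest pending hearing
        remaining[chosen] -= 1
        last_heard[chosen] = t
        if remaining[chosen] == 0:
            completion[chosen] = t
        t += 1
    return sum(completion) / n

rng = random.Random(1)
hearings = [rng.randint(1, 5) for _ in range(50)]
print("case-level FIFO avg completion:   ", simulate("case", hearings))
print("hearing-level FIFO avg completion:", simulate("hearing", hearings))
```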
Computing an index policy for multiarmed bandits with deadlines
This paper introduces the multiarmed bandit problem with deadlines, which concerns the dynamic selection of a live project to engage out of a portfolio of Markovian bandit projects expiring after given deadlines, to maximize the expected total discounted or undiscounted reward earned. Although the problem is computationally intractable, a natural heuristic policy is obtained by attaching to eac...
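The paper's deadline-aware index is not reproduced here; the sketch below only shows the generic shape of an index policy for deadline-constrained bandit projects: each period, compute an index for every project that has not yet expired and engage the one with the largest index. The Project class and placeholder_index function are hypothetical stand-ins, not the paper's construction.

```python
import random

class Project:
    """A toy Markovian bandit project with a hard deadline."""
    def __init__(self, reward_means, deadline):
        self.reward_means = reward_means  # mean reward in each state
        self.deadline = deadline          # last period the project is live
        self.state = 0

    def expired(self, t):
        return t > self.deadline

    def engage(self):
        # Collect a noisy reward and move to the next state (capped at the last one).
        reward = random.gauss(self.reward_means[self.state], 0.1)
        self.state = min(self.state + 1, len(self.reward_means) - 1)
        return reward

def placeholder_index(project, t):
    # Placeholder index: the current state's mean reward. The paper instead
    # attaches a deadline-aware index to each project.
    return project.reward_means[project.state]

def run_index_policy(projects, horizon):
    total = 0.0
    for t in range(horizon):
        live = [p for p in projects if not p.expired(t)]
        if not live:
            break
        chosen = max(live, key=lambda p: placeholder_index(p, t))
        total += chosen.engage()
    return total

projects = [Project([0.5, 0.3, 0.1], deadline=4),
            Project([0.2, 0.6, 0.4], deadline=9)]
print(run_index_policy(projects, horizon=10))
```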
Journal
Journal title: The Annals of Applied Statistics
Year: 2020
ISSN: 1932-6157
DOI: 10.1214/20-aoas1370