Search results for: regret analysis

Number of results: 2828405

2017
Wataru Kumagai

The dueling bandit is a learning framework wherein the feedback information in the learning process is restricted to a noisy comparison between a pair of actions. In this research, we address a dueling bandit problem based on a cost function over a continuous space. We propose a stochastic mirror descent algorithm and show that the algorithm achieves an O(√(T log T))-regret bound under strong ...

2017
Yumiko Kuraoka, Kazuhiro Nakayama

BACKGROUND: A tube feeding decision aid designed at the Ottawa Health Research Institute was specifically created for substitute decision-makers who must decide whether to allow placement of a percutaneous endoscopic gastrostomy (PEG) tube in a cognitively impaired older person. We developed a Japanese version and found that the decision aid promoted the decision-making process of substitute dec...

2006
Shai Shalev-Shwartz, Yoram Singer

We describe and analyze an algorithmic framework for playing convex repeated games. In each trial of the repeated game, the first player predicts a vector and then the second player responds with a loss function over the vector. Based on a generalization of Fenchel duality, we derive an algorithmic framework for the first player and analyze the player’s regret. We then use our a...

2005

Sensitive error correcting output codes are a reduction from cost sensitive classification to binary classification. They are a modification of error correcting output codes [3] which satisfy an additional property: regret for binary classification implies at most 2 l2 regret for cost-estimation. This has several implications: 1) Any 0/1 regret minimizing online algorithm is (via the reduction) a r...

Journal: Machine Learning 2022

We study the problem of online kernel selection under computational constraints, where memory or time and prediction procedures is restricted to a fixed budget. In this paper, we analyze worst-case lower bounds on the regret of algorithms with a subset of the observed examples, and design algorithms enjoying the corresponding upper bounds. also identify condition which constraints different from that constraints. To al...

2003
Amy Greenwald, Amir Jafari

A general class of no-regret learning algorithms, called no-Φ-regret learning algorithms, is defined which spans the spectrum from no-external-regret learning to no-internal-regret learning and beyond. The set Φ describes the set of strategies to which the play of a given learning algorithm is compared. A learning algorithm satisfies no-Φ-regret if no regret is experienced for playing as the al...

2017
Bangrui Chen, Peter I. Frazier

We consider online content recommendation with implicit feedback through pairwise comparisons, formalized as the so-called dueling bandit problem. We study the dueling bandit problem in the Condorcet winner setting, and consider two notions of regret: the more well-studied strong regret, which is 0 only when both arms pulled are the Condorcet winner; and the less well-studied weak regret, which...

Journal: Journal of Medical Ethics 2012

Journal: Techniques in Vascular and Interventional Radiology 2018

Journal: British Journal of Surgery 2020
