Search results for: regret eating
Number of results: 57757
We present a general framework for analyzing regret in the online prediction problem. We develop this from sets of linear transformations of strategies. We establish relationships among the varieties of regret and present a class of regret-matching algorithms. Finally we consider algorithms that exhibit the asymptotic no-regret property. Our main results are an analysis of observed regret in ex...
Sensitive error correcting output codes are a reduction from cost-sensitive classification to binary classification. They are a modification of error correcting output codes [3] which satisfy an additional property: ε regret for binary classification implies at most 2ε l2 regret for cost-estimation. This has several implications: 1) Any 0/1 regret minimizing online algorithm is (via the reduction) a r...
This paper considers the stability of online learning algorithms and its implications for learnability (bounded regret). We introduce a novel quantity called forward regret that intuitively measures how good an online learning algorithm is if it is allowed a one-step look-ahead into the future. We show that given stability, bounded forward regret is equivalent to bounded regret. We also show th...
A general class of no-regret learning algorithms, called no-Φ-regret learning algorithms, is defined which spans the spectrum from no-external-regret learning to no-internal-regret learning and beyond. The set Φ describes the set of strategies to which the play of a given learning algorithm is compared. A learning algorithm satisfies no-Φ-regret if no regret is experienced for playing as the al...
We consider online content recommendation with implicit feedback through pairwise comparisons, formalized as the so-called dueling bandit problem. We study the dueling bandit problem in the Condorcet winner setting, and consider two notions of regret: the more well-studied strong regret, which is 0 only when both arms pulled are the Condorcet winner; and the less well-studied weak regret, which...
The optimization problem for the general utility case is considered for countable-state semi-Markov decision processes. The regret-utility function is introduced as a function of two variables: a target value and a present value. We consider the expectation of the regret-utility function incurred until the time at which a given absorbing set is reached. In order to characterize the regret...