Search results for: regret analysis

Number of results: 2828405

Journal: CoRR 2016
Jianzhong Qi Fei Zuo Jia Cheng Yao

The k-regret queries aim to return a size-k subset of the entire database such that, for any query user that selects a data object in this size-k subset rather than in the entire database, her “regret ratio” is minimized. Here, the regret ratio is modeled by the level of difference in the optimality between the optimal object in the size-k subset returned and the optimal object in the entire da...
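As a rough illustration of the regret ratio mentioned in this abstract, the sketch below assumes a linear utility function and the commonly used definition (best utility in the database minus best utility in the subset, divided by the best utility in the database). The names `utility`, `regret_ratio`, and the sample data are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def utility(weights: Sequence[float], point: Point) -> float:
    """Linear utility of a point (an assumption; the paper may use other forms)."""
    return sum(w * x for w, x in zip(weights, point))


def regret_ratio(database: Sequence[Point], subset: Sequence[Point],
                 weights: Sequence[float]) -> float:
    """Relative utility lost by choosing from `subset` instead of `database`.

    Assumes the best utility over the database is strictly positive.
    """
    best_all = max(utility(weights, p) for p in database)
    best_sub = max(utility(weights, p) for p in subset)
    return (best_all - best_sub) / best_all


# Hypothetical data: three objects, of which two are returned as the subset.
database = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
subset = [(1.0, 0.0), (0.0, 1.0)]

print(regret_ratio(database, subset, (0.5, 0.5)))  # user who weighs both attributes equally
print(regret_ratio(database, subset, (1.0, 0.0)))  # user who only cares about the first attribute
```

A k-regret query would then look for the size-k subset that keeps this ratio small across all utility functions under consideration; the sketch only evaluates the ratio for one given subset and one user.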

Journal: CoRR 2013
Ziqiang Shi

Online and stochastic learning has emerged as a powerful tool in large-scale optimization. In this work, we generalize the Douglas-Rachford splitting (DRs) method for minimizing composite functions to online and stochastic settings (to the best of our knowledge, this is the first time DRs has been generalized to a sequential version). We first establish an O(1/√T) regret bound for the batch DRs method. Then we ...

2017
Lihong Li Yu Lu Dengyong Zhou

Contextual bandits are widely used in Internet services, from news recommendation to advertising and Web search. Generalized linear models (logistic regression in particular) have demonstrated stronger performance than linear models in many applications where rewards are binary. However, most theoretical analyses of contextual bandits so far are on linear bandits. In this work, we propose ...

2007
Jörg Stoye

Consider a decision maker who faces a number of possible models of the world. Every model generates objective probabilities, but no probabilities of models are given. This is the classic setting of statistical decision theory; recent and less standard applications include decision making with model uncertainty, e.g. due to concerns for misspecification, treatment choice with partial identificat...

Journal: CoRR 2017
Daniel Russo David Tse Benjamin Van Roy

The literature on bandit learning and regret analysis has focused on contexts where the goal is to converge on an optimal action in a manner that limits exploration costs. One shortcoming imposed by this orientation is that it does not treat time preference in a coherent manner. Time preference plays an important role when the optimal action is costly to learn relative to near-o...

2008
Stephen Lovelady

This research looks into the possibility of bringing together two distinct and empirically successful areas of behavioural economics: regret aversion and quasi-hyperbolic discounting. Standard regret aversion theory (Loomes and Sugden 1982, Bell 1982) operates in a world of uncertainty, where regret aversion arises due to the possibility of, having made an initial choice, a more preferable opti...

2016
Tianbao Yang

In this note, we study Nesterov’s accelerated gradient descent method in an online setting and establish both variational static and dynamic regret bounds using the functional variation, which “match” previous regret bounds in terms of gradient variation. To the best of our knowledge, this is the first work to study Nesterov’s accelerated gradient method in an online setting and our regret boun...
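For readers unfamiliar with the two notions of regret named in this abstract, the usual textbook definitions are sketched below. The notation (loss functions $f_t$, iterates $x_t$, feasible set $\mathcal{X}$, horizon $T$) is assumed for illustration and may differ from the note's own.

```latex
% Static regret: compare against the single best fixed decision in hindsight.
\mathrm{Regret}^{\mathrm{static}}_T
  = \sum_{t=1}^{T} f_t(x_t) - \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x)

% Dynamic regret: compare against the best decision at every round.
\mathrm{Regret}^{\mathrm{dynamic}}_T
  = \sum_{t=1}^{T} f_t(x_t) - \sum_{t=1}^{T} \min_{x \in \mathcal{X}} f_t(x)
```

Because the dynamic comparator may change every round, dynamic regret is never smaller than static regret for the same sequence of decisions.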

2017
Pratik Gajane Tanguy Urvoy Emilie Kaufmann

We study a variant of the stochastic multi-armed bandit (MAB) problem in which the rewards are corrupted. In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of transformation of these rewards through a stochastic corruption process with known parameters. We provide a lower boun...

2014
Mehryar Mohri Andres Muñoz Medina

We study revenue optimization learning algorithms for posted-price auctions with strategic buyers. We analyze a very broad family of monotone regret minimization algorithms for this problem, which includes the previously best known algorithm, and show that no algorithm in that family admits a strategic regret more favorable than Ω(√T). We then introduce a new algorithm that achieves a strate...

Journal: JNW 2013
Fangwei Li Yongchuan Tang Jiang Zhu

To improve resource utilization in asymmetric wireless networks, a novel dynamic resource access algorithm is presented. Owing to the asymmetry of information and the locality of users' actions in distributed wireless networks, the resource access problem is expressed as a simple graphical game model. Let the graphic topology indicate the internal game structure of the realistic enviro...
