regret

نتایج جستجو برای: regret

تعداد نتایج: 5407 فیلتر نتایج به سال:

Conditional Swap Regret and Conditional Correlated Equilibrium

2014

Mehryar Mohri Scott Yang

We introduce a natural extension of the notion of swap regret, conditional swap regret, that allows for action modifications conditioned on the player’s action history. We prove a series of new results for conditional swap regret minimization. We present algorithms for minimizing conditional swap regret with bounded conditioning history. We further extend these results to the case where conditi...

متن کامل

An Overall Purchasing Process Model of Internet Buyers: The Role of Regret in Electronic Commerce

2012

Chuang-Chun Liu Chechen Liao I-Cheng Chang

This study focuses on the antecedents and consequences of Internet buyer regret in the overall purchasing process. We examine the roles that search effort, service-attribute evaluations, product-attribute evaluations and post-purchase price perceptions play in determining buyer regret and satisfaction in e-commerce. Furthermore, the study examines the consequences of regret and satisfaction in ...

متن کامل

Understanding the Reasons behind Anticipated Regret for Missing Regular Physical Activity

2016

Ryan E. Rhodes Chetan D. Mistry

Anticipated affective reactions to missing physical activity (PA), often labeled anticipated regret, has reliable evidence as a predictor of PA intention and behavior independent of other standard social cognitive constructs. Despite this evidence, the sources of regret are understudied and may come from many different reasons. The purpose of this study was to theme the reasons for why people r...

متن کامل

A Memory I Regret

Journal: :Kazoku syakaigaku kenkyu 1991

متن کامل

Regret-Matching Bounds Bounds for Regret-Matching Algorithms

2007

Amy Greenwald Zheng Li

We study a general class of learning algorithms, which we call regret-matching algorithms, along with a general framework for analyzing their performance in online (sequential) decision problems (ODPs). In each round of an ODP, an agent chooses a probabilistic action and receives a reward. The particular reward function that applies at any given round is not revealed until after the agent acts....

متن کامل

Response Regret

2005

Martin Zinkevich

The concept of regret is designed for the long-term interaction of multiple agents. However, most concepts of regret do not consider even the short-term consequences of an agent’s actions: e.g., how other agents may be “nice” to you tomorrow if you are “nice” to them today. For instance, an agent that always defects while playing the Prisoner’s Dilemma will never have any swap or external regre...

متن کامل

Bounds for Regret-Matching Algorithms

2006

Amy Greenwald Zheng Li Casey Marks

We introduce a general class of learning algorithms, regret-matching algorithms, and a regret-based framework for analyzing their performance in online decision problems. Our analytic framework is based on a set Φ of transformations over the set of actions. Specifically, we calculate a Φ-regret vector by comparing the average reward obtained by an agent over some finite sequence of rounds to th...

متن کامل

Games Instructor : Alex Slivkins Homework 1 : bandits with IID rewards

2016

Alex Slivkins

Problem 1: rewards from a small interval. Consider a version of the problem in which all the realized rewards are in the interval [12 , 1 2 + ] for some ∈ (0, 1 2). Define versions of UCB1 and Successive Elimination attain improved regret bounds (both logarithmic and root-T) that depend on the . Hint: Use a more efficient version of Hoeffding Inequality in the slides from the first lecture. It ...

متن کامل

Nonstochastic bandits: Countable decision set, unbounded costs and reactive environments

Journal: :Theor. Comput. Sci. 2008

Jan Poland

The nonstochastic multi-armed bandit problem, first studied by Auer, Cesa-Bianchi, Freund, and Schapire in 1995, is a game of repeatedly choosing one decision from a set of decisions (“experts”), under partial observation: In each round t , only the cost of the decision played is observable. A regret minimization algorithm plays this game while achieving sublinear regret relative to each decisi...

متن کامل

Learning to Minimise Regret in Route Choice

2017

Gabriel de Oliveira Ramos Bruno Castro da Silva Ana L. C. Bazzan

Reinforcement learning (RL) is a challenging task, especially in highly competitive multiagent scenarios. We consider the route choice problem, in which self-interested drivers aim at choosing routes that minimise their travel times. Employing RL here is challenging because agents must adapt to each others’ decisions. In this paper, we investigate how agents can overcome such condition by minim...

متن کامل

نمودار تعداد نتایج جستجو در هر سال

با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید