Search results for: regret minimization

Number of results: 37822

2015
Ying Liu Leonard N. Stern

We consider the pricing problem faced by a monopolist who sells a product to a population of consumers over a discrete number of periods. Customers are heterogeneous in both the willingness-to-pay for the product and the arrival time during the selling season. We assume that the seller knows only the support of the customers’ valuations and do not make any other distributional assumptions about...

2007
Ramesh Johari

Calibration is a concept that tries to formalize a notion of quality for forecasters. For example, suppose a weatherman predicts each day whether it will rain or be sunny. Typically forecasters will predict such events in terms of probabilities, i.e., “There is a 30% chance of rain.” Given only the outcome that day, it is impossible to judge the quality of such a forecast. However, if we c...

Journal: Decision Analysis 2016
Juan Sebastian Borrero Oleg A. Prokopyev Denis Sauré

We study sequential interdiction when the interdictor has incomplete initial information about the network, and the evader has complete knowledge of the network, including its structure and arc costs. In each time period, the interdictor blocks at most k arcs from the network observed up to that period, after which the evader travels along a shortest path between two (fixed) nodes in the interd...

2017
Alexander Rakhlin Karthik Sridharan

We study an equivalence of (i) deterministic pathwise statements appearing in the online learning literature (termed regret bounds), (ii) high-probability tail bounds for the supremum of a collection of martingales (of a specific form arising from uniform laws of large numbers for martingales), and (iii) in-expectation bounds for the supremum. By virtue of the equivalence, we prove exponential ...

2017
Noam Brown Tuomas Sandholm

Iterative algorithms such as Counterfactual Regret Minimization (CFR) are the most popular way to solve large zero-sum imperfect-information games. In this paper we introduce Best-Response Pruning (BRP), an improvement to iterative algorithms such as CFR that allows poorly-performing actions to be temporarily pruned. We prove that when using CFR in zero-sum games, adding BRP will asymptotically...

2011
Todd W. Neller Steven Hnath

Using the bluffing dice game Dudo as a challenge domain, we abstract information sets using imperfect recall of actions. Even with such abstraction, the standard Counterfactual Regret Minimization (CFR) algorithm proves impractical for Dudo, with the number of recursive visits to the same abstracted information sets increasing exponentially with the depth of the game graph. By holding strategie...

2011
Michael Kaisers Daan Bloembergen Karl Tuyls

The number of proposed reinforcement learning algorithms appears to be ever-growing. This article tackles the diversification by showing a persistent principle in several independent reinforcement learning algorithms that have been applied to multi-agent settings. While their learning structure may look very diverse, algorithms such as Gradient Ascent, Cross learning, variations of Q-learning a...

2009
Shai Shalev-Shwartz Ohad Shamir Nathan Srebro Karthik Sridharan

For supervised classification problems, it is well known that learnability is equivalent to uniform convergence of the empirical risks and thus to learnability by empirical minimization. Inspired by recent regret bounds for online convex optimization, we study stochastic convex optimization, and uncover a surprisingly different situation in the more general setting: although the stochastic conv...

2011
Jacob D. Abernethy Peter L. Bartlett Elad Hazan

We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. Blackwell himself previously showed that the theorem implies the existence of a “no-regret” algorithm for a simple online learning problem. We show that this relationship is in fact much stronger, that Blackwell’s result is equivalent to, in a very strong sense, the problem of regret minimizati...
