Search results for: regret minimization

Number of results: 37822

Journal: Games and Economic Behavior 2008
Shie Mannor Nahum Shimkin

Regret minimization in repeated matrix games has been extensively studied ever since Hannan’s (1957) seminal paper. Several classes of no-regret strategies now exist; such strategies secure a long-term average payoff as high as could be obtained by the fixed action that is best, in hindsight, against the observed action sequence of the opponent. We consider an extension of this framework to repe...
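The no-regret notion in the abstract above can be illustrated with a minimal sketch that measures average external regret: the gap between the payoff of the best fixed action in hindsight and the payoff actually realized. The function name `external_regret`, the payoff-matrix layout, and the matching-pennies example are assumptions made for illustration; they do not come from the paper.

```python
import numpy as np

def external_regret(payoff, my_actions, opp_actions):
    """Average external regret of a played sequence in a repeated matrix game.

    payoff[i, j] is the row player's payoff for playing action i
    against opponent action j (an assumed convention).
    """
    payoff = np.asarray(payoff, dtype=float)
    my_actions = np.asarray(my_actions)
    opp_actions = np.asarray(opp_actions)
    rounds = len(opp_actions)

    # Payoff actually collected over the observed play.
    realized = payoff[my_actions, opp_actions].sum()

    # Payoff each fixed action would have collected against the same
    # observed opponent sequence.
    fixed_action_totals = payoff[:, opp_actions].sum(axis=1)

    return (fixed_action_totals.max() - realized) / rounds

# Matching pennies from the row player's point of view (illustrative).
PENNIES = [[1, -1], [-1, 1]]
```

A strategy is no-regret when this quantity is asymptotically non-positive as the number of rounds grows, whatever the opponent does.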

Journal: IEEE Transactions on Automatic Control 2023

We consider estimation and control in linear dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing causal state estimators and controllers which compete against a clairvoyant noncausal policy, instead of the best policy selected in hindsight from some fixed parametric class. We show that regret-optimal filters can be derived in state-space form us...

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence 2023

Submodular maximization has attracted extensive attention due to its numerous applications in machine learning and artificial intelligence. Many real-world problems require maximizing multiple submodular objective functions at the same time. In such cases, a common approach is to select a representative subset of Pareto optimal solutions with different trade-offs among the objectives. To this end, in this paper...

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence 2021

Policy Optimization (PO) is a widely used approach to address continuous control tasks. In this paper, we introduce the notion of mediator feedback that frames PO as an online learning problem over the policy space. The additional available information, compared to the standard bandit feedback, allows reusing samples generated by one policy to estimate the performance of other policies. Based on this observation, we propose an algor...

Journal: CoRR 2016
Noam Brown Tuomas Sandholm

Counterfactual Regret Minimization (CFR) is the most popular iterative algorithm for solving zero-sum imperfect-information games. Regret-Based Pruning (RBP) is an improvement that allows poorly-performing actions to be temporarily pruned, thus speeding up CFR. We introduce Total RBP, a new form of RBP that reduces the space requirements of CFR as actions are pruned. We prove that in zero-sum g...

2011
Stefanie Jegelka Jeff A. Bilmes

Most results for online decision problems with structured concepts, such as trees or cuts, assume linear costs. In many settings, however, nonlinear costs are more realistic. Owing to their non-separability, these lead to much harder optimization problems. Going beyond linearity, we address online approximation algorithms for structured concepts that allow the cost to be submodular, i.e., nonse...

Journal: CoRR 2014
Mehryar Mohri Andres Muñoz Medina

We study revenue optimization learning algorithms for posted-price auctions with strategic buyers. We analyze a very broad family of monotone regret minimization algorithms for this problem, which includes the previously best known algorithm, and show that no algorithm in that family admits a strategic regret more favorable than Ω(√T). We then introduce a new algorithm that achieves a strate...

Journal: CoRR 2016
Sevi Baltaoglu Lang Tong Qing Zhao

The problem of online learning and optimization of unknown Markov jump affine models is considered. An online learning policy, referred to as Markovian simultaneous perturbations stochastic approximation (MSPSA), is proposed for two different optimization objectives: (i) the quadratic cost minimization of the regulation problem and (ii) the revenue (profit) maximization problem. It is shown tha...

2015
Noam Brown Tuomas Sandholm

Counterfactual Regret Minimization (CFR) is a leading algorithm for finding a Nash equilibrium in large zero-sum imperfect-information games. CFR is an iterative algorithm that repeatedly traverses the game tree, updating regrets at each information set. We introduce an improvement to CFR that prunes any path of play in the tree, and its descendants, that has negative regret. It revisits that s...
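The per-information-set regret update mentioned in the abstract above is commonly done with regret matching. The sketch below shows that rule for a single decision point only. It is not the pruning improvement the authors introduce and it is not a full CFR tree traversal; the function names and the three-action example are hypothetical.

```python
import numpy as np

def regret_matching_strategy(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No action has positive regret: fall back to the uniform strategy.
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

def update_regrets(cum_regret, action_values, strategy):
    """Add each action's value minus the current strategy's expected value."""
    action_values = np.asarray(action_values, dtype=float)
    expected = float(np.dot(strategy, action_values))
    return cum_regret + (action_values - expected)

# One illustrative update at a decision point with three actions.
regrets = np.zeros(3)
strategy = regret_matching_strategy(regrets)          # uniform at the start
regrets = update_regrets(regrets, [1.0, 0.0, -1.0], strategy)
next_strategy = regret_matching_strategy(regrets)     # all mass on action 0
```

In this sketch an action whose cumulative regret is negative receives zero probability under the next strategy, which is the kind of action the abstract describes pruning.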
