regret minimization

Search results for: regret minimization

Number of results: 37822

Regret minimization in repeated matrix games with variable stage duration

Journal: Games and Economic Behavior, 2008

Shie Mannor Nahum Shimkin

Regret minimization in repeated matrix games has been extensively studied ever since Hannan’s (1957) seminal paper. Several classes of no-regret strategies now exist; such strategies secure a long-term average payoff as high as could be obtained by the fixed action that is best, in hindsight, against the observed action sequence of the opponent. We consider an extension of this framework to repe...
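The no-regret benchmark described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical layout in which `payoffs[t][a]` is the payoff action `a` would have earned in round `t` against the opponent's observed action; the function name and the sample data are illustrative only and are not taken from the paper.

```python
def external_regret(payoffs, actions):
    """Regret against the best fixed action in hindsight.

    payoffs[t][a]: payoff action `a` would have earned in round t
                   (hypothetical layout, assumed for this sketch).
    actions[t]:    index of the action actually played in round t.
    """
    # Total payoff the player actually collected.
    realized = sum(row[a] for row, a in zip(payoffs, actions))
    # Total payoff of each fixed action, had it been played every round.
    n_actions = len(payoffs[0])
    best_fixed = max(sum(row[a] for row in payoffs) for a in range(n_actions))
    return best_fixed - realized


# Illustrative data: 4 rounds, 2 actions.
payoffs = [[1, 0], [0, 1], [1, 0], [1, 0]]
actions = [1, 0, 0, 1]
regret = external_regret(payoffs, actions)
average_regret = regret / len(actions)
```

A strategy is called no-regret when this average regret tends to zero (or below) as the number of rounds grows, whatever the opponent plays.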


Regret-Optimal Estimation and Control

Journal: IEEE Transactions on Automatic Control, 2023

We consider estimation and control in linear dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing causal state estimators and controllers which compete against a clairvoyant noncausal policy, instead of the best policy selected in hindsight from some fixed parametric class. We show that regret-optimal filters can be derived in state-space form us...


Improved Algorithm for Regret Ratio Minimization in Multi-Objective Submodular Maximization

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence, 2023

Submodular maximization has attracted extensive attention due to its numerous applications in machine learning and artificial intelligence. Many real-world problems require maximizing multiple submodular objective functions at the same time. In such cases, a common approach is to select a representative subset of Pareto optimal solutions with different trade-offs among the objectives. To this end, this paper...


Policy Optimization as Online Learning with Mediator Feedback

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence, 2021

Policy Optimization (PO) is a widely used approach to address continuous control tasks. In this paper, we introduce the notion of mediator feedback that frames PO as an online learning problem over the policy space. The additional available information, compared to the standard bandit feedback, allows reusing samples generated by one policy to estimate the performance of other policies. Based on this observation, we propose an algor...


System redundancy optimization with uncertain stress-based component reliability: Minimization of regret

Journal: Reliability Engineering & System Safety, 2016


Reduced Space and Faster Convergence in Imperfect-Information Games via Regret-Based Pruning

Journal: CoRR, 2016

Noam Brown Tuomas Sandholm

Counterfactual Regret Minimization (CFR) is the most popular iterative algorithm for solving zero-sum imperfect-information games. Regret-Based Pruning (RBP) is an improvement that allows poorly-performing actions to be temporarily pruned, thus speeding up CFR. We introduce Total RBP, a new form of RBP that reduces the space requirements of CFR as actions are pruned. We prove that in zero-sum g...


Online Submodular Minimization for Combinatorial Structures

2011

Stefanie Jegelka Jeff A. Bilmes

Most results for online decision problems with structured concepts, such as trees or cuts, assume linear costs. In many settings, however, nonlinear costs are more realistic. Owing to their non-separability, these lead to much harder optimization problems. Going beyond linearity, we address online approximation algorithms for structured concepts that allow the cost to be submodular, i.e., nonse...


Revenue Optimization in Posted-Price Auctions with Strategic Buyers

Journal: CoRR, 2014

Mehryar Mohri Andres Muñoz Medina

We study revenue optimization learning algorithms for posted-price auctions with strategic buyers. We analyze a very broad family of monotone regret minimization algorithms for this problem, which includes the previously best known algorithm, and show that no algorithm in that family admits a strategic regret more favorable than Ω(√T). We then introduce a new algorithm that achieves a strate...


Online Learning and Optimization of Markov Jump Affine Models

Journal: CoRR, 2016

Sevi Baltaoglu Lang Tong Qing Zhao

The problem of online learning and optimization of unknown Markov jump affine models is considered. An online learning policy, referred to as Markovian simultaneous perturbations stochastic approximation (MSPSA), is proposed for two different optimization objectives: (i) the quadratic cost minimization of the regulation problem and (ii) the revenue (profit) maximization problem. It is shown tha...


Regret-Based Pruning in Extensive-Form Games

2015

Noam Brown Tuomas Sandholm

Counterfactual Regret Minimization (CFR) is a leading algorithm for finding a Nash equilibrium in large zero-sum imperfect-information games. CFR is an iterative algorithm that repeatedly traverses the game tree, updating regrets at each information set. We introduce an improvement to CFR that prunes any path of play in the tree, and its descendants, that has negative regret. It revisits that s...
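The per-information-set regret update mentioned in this abstract is commonly carried out with regret matching. The sketch below shows only that step at a single decision point, under simplifying assumptions. It is not the authors' pruning method, and the function names and sample action values are hypothetical.

```python
def regret_matching(cum_regret):
    """Turn cumulative regrets into a strategy (probability per action)."""
    # Only actions with positive cumulative regret receive probability mass.
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret anywhere: fall back to the uniform strategy.
    return [1.0 / n] * n


def update_regrets(cum_regret, action_values, strategy):
    """Add each action's regret relative to the strategy's expected value."""
    expected = sum(p * v for p, v in zip(strategy, action_values))
    return [r + (v - expected) for r, v in zip(cum_regret, action_values)]


# One illustrative iteration with three actions (values are made up).
regrets = [0.0, 0.0, 0.0]
strategy = regret_matching(regrets)            # uniform at the start
regrets = update_regrets(regrets, [1.0, 0.0, -1.0], strategy)
strategy = regret_matching(regrets)            # mass shifts to the first action
```

Under regret matching, an action whose cumulative regret is negative receives zero probability. That makes actions with negative regret natural candidates for the pruning idea the abstract describes.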

