Search results for: regret minimization
Number of results: 37,822
Counterfactual Regret Minimization (CFR) has achieved many fascinating results in solving large-scale Imperfect Information Games (IIGs). Neural-network approximation of CFR (neural CFR) is one of the promising techniques that can reduce computation and memory consumption by generalizing decision information across similar states. Current neural CFR algorithms have to approximate cumulative regrets. ...
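The regret-matching update at the heart of CFR can be sketched in a few lines. This is a minimal single-decision illustration, not the full extensive-game algorithm; the function names and the toy utilities are ours:

```python
def regret_matching(cum_regret):
    """Derive a mixed strategy from cumulative regrets (regret matching)."""
    positives = [max(r, 0.0) for r in cum_regret]
    total = sum(positives)
    if total > 0.0:
        return [p / total for p in positives]
    # No positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(cum_regret)] * len(cum_regret)

def accumulate_regret(cum_regret, action_utils, played):
    """Add, for each action, the regret for not having played it instead."""
    for a, u in enumerate(action_utils):
        cum_regret[a] += u - action_utils[played]

# Toy example: a player repeatedly picked the worst of three actions;
# regret matching then shifts probability toward the better ones.
utils = [1.0, 0.5, 0.0]        # hypothetical per-action utilities
regrets = [0.0, 0.0, 0.0]
for _ in range(10):
    accumulate_regret(regrets, utils, played=2)
strategy = regret_matching(regrets)  # -> [2/3, 1/3, 0]
```

In full CFR this update is applied at every information set, with utilities weighted by counterfactual reach probabilities; the average strategy over iterations converges to an approximate Nash equilibrium in two-player zero-sum games.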
We consider a stochastic multi-armed bandit setting and study the problem of constrained regret minimization over a given time horizon. Each arm is associated with an unknown, possibly multi-dimensional distribution, and its merit is determined by several, possibly conflicting attributes. The aim is to optimize a ‘primary’ attribute subject to user-provided constraints on the other ‘secondary’ attributes. We assume that the attributes can be esti...
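For context, unconstrained regret minimization in the stochastic bandit setting is classically handled by index policies such as UCB1, which the constrained variants build on. A minimal sketch of the index computation (the arm statistics below are illustrative):

```python
import math

def ucb1_index(mean_reward, pulls, total_pulls):
    """UCB1 index: empirical mean plus an exploration bonus that
    shrinks as the arm is pulled more often."""
    if pulls == 0:
        return float("inf")  # force each arm to be tried at least once
    return mean_reward + math.sqrt(2.0 * math.log(total_pulls) / pulls)

# At each step, pull the arm with the highest index.
means = [0.9, 0.5]   # empirical mean rewards so far
pulls = [3, 30]      # number of times each arm was pulled
t = sum(pulls)
best = max(range(len(means)), key=lambda a: ucb1_index(means[a], pulls[a], t))
```

The bonus term guarantees logarithmic regret in the horizon; constrained formulations like the one above must additionally track estimates of the secondary attributes.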
A new choice model is derived, rooted in the framework of Random Regret Minimization (RRM). The proposed model postulates that when choosing, people anticipate and aim to minimize regret. Whereas previous regret-based discrete choice models assume that regret is experienced only with respect to the best of the foregone alternatives, the proposed model assumes that regret is potentially experienced ...
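The systematic regret term in RRM-style models of this kind sums pairwise attribute comparisons over all foregone alternatives rather than only the best one. A sketch, assuming the common log-sum regret form from Chorus' RRM; the attribute values and weights are illustrative:

```python
import math

def rrm_regret(i, alternatives, betas):
    """Systematic regret of alternative i: for every foregone alternative j
    and every attribute m, add ln(1 + exp(beta_m * (x_jm - x_im)))."""
    regret = 0.0
    x_i = alternatives[i]
    for j, x_j in enumerate(alternatives):
        if j == i:
            continue
        for m, beta in enumerate(betas):
            regret += math.log(1.0 + math.exp(beta * (x_j[m] - x_i[m])))
    return regret

# Two alternatives with identical attributes yield the baseline ln(2)
# regret per attribute comparison.
alts = [[1.0, 2.0], [1.0, 2.0]]
r = rrm_regret(0, alts, betas=[0.5, 0.5])
```

Choice probabilities then follow a logit over negated regrets, so the alternative with the smallest total regret is the most likely to be chosen.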
This paper studies a variety of forms of regret minimization as the criteria with which traders choose their bids/asks in a double auction. Unlike the expected utility maximizers that populate typical market models, these traders do not determine their actions using a single prior. The analysis proves that minimax regret traders will not converge to price-taking as the number of traders in the ...
Counterfactual Regret Minimization (CFR) is an efficient no-regret learning algorithm for decision problems modeled as extensive games. CFR’s regret bounds depend on the requirement of perfect recall: players always remember information that was revealed to them and the order in which it was revealed. In games without perfect recall, however, CFR’s guarantees do not apply. In this paper, we pre...
Background: Reinforcement learning in complex games has traditionally been the domain of value- or policy-iteration algorithms, owing to their effectiveness in planning in Markov decision processes, before algorithms with regret minimization guarantees, such as upper confidence bounds applied to trees (UCT) and counterfactual regret minimization, were developed and proved to be very succe...
Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome samplin...
We propose a novel baseline regret minimization algorithm for multi-agent planning problems modeled as finite-horizon decentralized POMDPs. It is guaranteed to produce a policy that is provably at least as good as a given baseline policy. We also propose an iterative belief generation algorithm that efficiently minimizes the baseline regret, requiring only as many iterations as necessary to converge ...