Search results for: markov games

Number of results: 126585

2016
Julien Pérolat Bilal Piot Matthieu Geist Bruno Scherrer Olivier Pietquin

This paper reports theoretical and empirical investigations on the use of quasi-Newton methods to minimize the Optimal Bellman Residual (OBR) of zero-sum two-player Markov Games. First, it reveals that state-of-the-art algorithms can be derived by the direct application of Newton’s method to different norms of the OBR. More precisely, when applied to the norm of the OBR, Newton’s method results...
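The connection the abstract points to — Newton's method applied to a norm of the Optimal Bellman Residual recovering known algorithms — can be illustrated on a single-agent MDP standing in for one player's problem (the tiny MDP below is invented for the sketch). Linearizing the optimal Bellman operator at the greedy policy turns each Newton step into a policy-evaluation solve, i.e. policy iteration:

```python
gamma = 0.9
# Hypothetical 2-state, 2-action MDP: rewards r[s][a], transitions P[s][a].
r = [[0.0, 1.0], [2.0, 0.0]]
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.5, 0.5], [1.0, 0.0]]]

def bellman(V):
    """Optimal Bellman operator T: (TV)(s) = max_a r(s,a) + gamma * E[V(s')]."""
    return [max(r[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(2))
                for a in range(2)) for s in range(2)]

def greedy(V):
    """Policy at which T is linearized (the 'derivative' of T at V)."""
    return [max(range(2), key=lambda a: r[s][a] +
                gamma * sum(P[s][a][t] * V[t] for t in range(2)))
            for s in range(2)]

V = [0.0, 0.0]
for _ in range(50):                       # outer Newton iterations
    pi = greedy(V)
    # Newton step: solve (I - gamma * P_pi) V = r_pi, done here by
    # fixed-point iteration on the policy-evaluation operator.
    for _ in range(2000):
        V = [r[s][pi[s]] + gamma * sum(P[s][pi[s]][t] * V[t] for t in range(2))
             for s in range(2)]

residual = max(abs(tv - v) for tv, v in zip(bellman(V), V))
print(residual)    # near zero: V solves the optimal Bellman residual
```

The same linearize-then-solve pattern is what the paper studies for the two-player (zero-sum) operator.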

2004
Raghav Aras Alain Dutech François Charpillet

In this paper, we present a communication-integrated reinforcement-learning algorithm for a general-sum Markov game (MG) played by independent, cooperative agents. The algorithm assumes that agents can communicate but do not know the purpose (the semantics) of doing so. We model agents that have different tasks, some of which may be commonly beneficial. The objective of the agents is to determin...

Journal: :CoRR 2013
Yanling Chang Alan L. Erera Chelsea C. White

The intent of this research is to generate a set of non-dominated policies from which one of two agents (the leader) can select a most preferred policy to control a dynamic system that is also affected by the control decisions of the other agent (the follower). The problem is described by an infinite horizon, partially observed Markov game (POMG). The actions of the agents are selected simultan...

Journal: :Cognitive Systems Research 2001
Michael L. Littman

Markov games are a model of multiagent environments that are convenient for studying multiagent reinforcement learning. This paper describes a set of reinforcement-learning algorithms based on estimating value functions and presents convergence theorems for these algorithms. The main contribution of this paper is that it presents the convergence theorems in a way that makes it easy to reason ab...
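The value-function algorithms this abstract refers to all target the same fixed point; a minimal model-based sketch of it is Shapley's value iteration, where each backup solves a stage matrix game. The 2-state zero-sum game below and its payoffs are invented for illustration; the 2x2 matrix-game solver uses the standard closed form:

```python
gamma = 0.9

def matrix_game_value(M):
    """Value of a 2x2 zero-sum matrix game for the row (max) player."""
    (a, b), (c, d) = M
    lower = max(min(a, b), min(c, d))     # best guaranteed pure row payoff
    upper = min(max(a, c), max(b, d))     # best pure column response
    if lower == upper:                    # pure saddle point
        return lower
    return (a * d - b * c) / (a + d - b - c)   # fully mixed equilibrium value

# Hypothetical 2-state zero-sum Markov game: stage payoffs R[s][a][o] for the
# row player and deterministic transitions Tnext[s][a][o].
R = [[[1.0, -1.0], [-1.0, 1.0]],          # state 0: matching pennies
     [[2.0, 0.0], [0.0, 1.0]]]
Tnext = [[[1, 0], [0, 1]],
         [[0, 0], [1, 1]]]

V = [0.0, 0.0]
for _ in range(500):                      # Shapley's value iteration
    V = [matrix_game_value([[R[s][a][o] + gamma * V[Tnext[s][a][o]]
                             for o in range(2)]
                            for a in range(2)])
         for s in range(2)]
print(V)
```

The sampled, model-free analogue of this backup is exactly the minimax-Q style of update the convergence theorems cover.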

2014
A. S. Nowak

In many real-life situations, the preferences of an economic agent change over time. Rational behaviour of such agents was studied by many authors (Strotz, Pollak, Bernheim and Ray) who considered so-called “consistent plans”. Phelps and Pollak [10] introduced the notion of “quasi-hyperbolic discounting”, which is a modification of the classical discounting proposed in 1937 by Samuelson. Within...
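The quasi-hyperbolic scheme mentioned here replaces Samuelson's exponential weights delta**t with the sequence 1, beta*delta, beta*delta**2, ...; a tiny sketch of the two discount rules (the utility stream and parameter values are arbitrary):

```python
def discounted_value(utils, delta):
    """Classical exponential discounting (Samuelson, 1937)."""
    return sum(delta ** t * u for t, u in enumerate(utils))

def quasi_hyperbolic_value(utils, beta, delta):
    """Phelps-Pollak quasi-hyperbolic (beta-delta) discounting:
    weight 1 on today, beta * delta**t on every period t >= 1."""
    return utils[0] + beta * sum(delta ** t * u
                                 for t, u in enumerate(utils) if t >= 1)

utils = [1.0, 1.0, 1.0]
exp_val = discounted_value(utils, 0.9)            # 1 + 0.9 + 0.81 = 2.71
qh_val = quasi_hyperbolic_value(utils, 0.7, 0.9)  # 1 + 0.7*(0.9 + 0.81) = 2.197
print(exp_val, qh_val)
```

With beta < 1 the immediate period is over-weighted relative to all future ones, which is the source of the time-inconsistent preferences that "consistent plans" address.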

2008
S. Moya

In this paper a regularized version of the "extraproximal method" is suggested for finding a Nash equilibrium in a multi-participant finite game where the dynamics of each player is governed by a finite controllable Markov chain. The suggested iterative technique realizes the application of a two-step procedure at each iteration: at the first (or preliminary) step some "predictive ...

Journal: :Int. J. Game Theory 1997
János Flesch Frank Thuijsman Koos Vrieze

We examine a three-person stochastic game where the only existing equilibria consist of cyclic Markov strategies. Unlike in two-person games of a similar type, stationary ε-equilibria (ε > 0) do not exist for this game. Besides, we characterize the set of feasible equilibrium rewards.

2004
Ville Könönen

The main aim of this paper is to extend the single-agent policy gradient method for multiagent domains where all agents share the same utility function. We formulate these team problems as Markov games endowed with the asymmetric equilibrium concept and based on this formulation, we provide a direct policy gradient learning method. In addition, we test the proposed method with a small example p...
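A hedged sketch of a policy-gradient method for a team Markov game, reduced here to a one-shot shared-payoff game with two independent REINFORCE learners (the payoff matrix, learning rate, and episode count are invented, and this is plain REINFORCE, not the paper's asymmetric-equilibrium formulation):

```python
import math
import random

random.seed(0)
payoff = [[0.0, 0.0], [0.0, 1.0]]   # shared team reward; only joint (1, 1) pays

def softmax(theta):
    z = [math.exp(t) for t in theta]
    s = sum(z)
    return [x / s for x in z]

thetas = [[0.0, 0.0], [0.0, 0.0]]   # one logit vector per agent
lr = 0.2
for _ in range(5000):
    probs = [softmax(t) for t in thetas]
    a1 = int(random.random() < probs[0][1])   # each agent samples independently
    a2 = int(random.random() < probs[1][1])
    reward = payoff[a1][a2]                   # same signal for both agents
    for i, a in enumerate((a1, a2)):
        for k in range(2):                    # REINFORCE: grad log pi = 1[a=k] - pi(k)
            thetas[i][k] += lr * reward * ((1.0 if a == k else 0.0) - probs[i][k])

final = [softmax(t) for t in thetas]
print(final)
```

Because both agents ascend the same expected-utility surface, the independent gradient updates coordinate on the rewarding joint action.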

2011
Endre Boros Khaled M. Elbassioni Mahmoud Fouz Vladimir Gurvich Kazuhisa Makino Bodo Manthey

We consider two-person zero-sum stochastic mean payoff games with perfect information modeled by a digraph with black, white, and random vertices. These BWR-games are polynomially equivalent to the classical Gillette games, which include many well-known subclasses, such as cyclic games, simple stochastic games, stochastic parity games, and Markov decision processes. They can also be use...
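For intuition, the mean payoff of the BW special case (no random vertices) can be approximated by finite-horizon value iteration, since v_T / T converges to the mean-payoff value; the two-vertex digraph below is invented for the sketch:

```python
# Hypothetical BW-game: vertex 0 is White (maximizer), vertex 1 is Black
# (minimizer); each edge is (head vertex, local reward).
edges = {0: [(0, 1.0), (1, 3.0)],
         1: [(1, 0.0), (0, 1.0)]}
is_white = {0: True, 1: False}

T = 1000                               # horizon; mean payoff ~ v_T / T
v = {u: 0.0 for u in edges}
for _ in range(T):                     # backward induction on the horizon
    v = {u: (max if is_white[u] else min)(w + v[h] for h, w in outs)
         for u, outs in edges.items()}

means = {u: v[u] / T for u in edges}
print(means)   # White secures ~1 from vertex 0; Black forces ~0 from vertex 1
```

Here White's best option at vertex 0 is the weight-1 self-loop (moving to vertex 1 lets Black loop forever at weight 0), so the long-run averages are 1 and 0 respectively.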

Journal: :SIAM J. Control and Optimization 2008
Erik Ekström Goran Peskir

where the horizon T (the upper bound for τ and σ above) may be either finite or infinite (it is assumed that G1(X_T) = G2(X_T) if T is finite, and lim inf_{t→∞} G2(X_t) ≤ lim sup_{t→∞} G1(X_t) if T is infinite). If X is right-continuous, then the Stackelberg equilibrium holds, in the sense that V^*(x) = V_*(x) for all x, with V := V^* = V_* defining a measurable function. If X is right-continuous and left-co...
