Optimal strategies for adaptive zero-sum average Markov games
Similar resources
The Distribution of Optimal Strategies in Symmetric Zero-sum Games
Given a skew-symmetric matrix, the corresponding two-player symmetric zero-sum game is defined as follows: one player, the row player, chooses a row and the other player, the column player, chooses a column. The payoff of the row player is given by the corresponding matrix entry, the column player receives the negative of the row player. A randomized strategy is optimal if it guarantees an expe...
On the equivalence of two expected average reward criteria for zero-sum semi-Markov games
In this paper we study two basic optimality criteria used in the theory of zero-sum semi-Markov games. According to the first one, the average reward for player 1 is the lim sup of the expected total rewards over a finite number of jumps divided by the expected cumulative time of these jumps. According to the second definition, the average reward (for player 1) is the lim sup of the expected to...
Reinforcement Learning for Average Reward Zero-Sum Games
We consider Reinforcement Learning for average reward zero-sum stochastic games. We present and analyze two algorithms. The first is based on relative Q-learning and the second on Q-learning for stochastic shortest path games. Convergence is proved using the ODE (Ordinary Differential Equation) method. We further discuss the case where not all the actions are played by the opponent with comparab...
Optimal strategies for equal-sum dice games
In this paper we consider a non-cooperative two-person zero-sum matrix game, called dice game. In an (n, σ) dice game, two players can independently choose a dice from a collection of hypothetical dice having n faces and with a total of σ eyes distributed over these faces. They independently roll their dice and the player showing the highest number of eyes wins (in case of a tie, none of the pl...
Sampling Techniques for Markov Games Approximation Results on Sampling Techniques for Zero-sum, Discounted Markov Games
We extend the “policy rollout” sampling technique for Markov decision processes to Markov games, and provide an approximation result guaranteeing that the resulting sampling-based policy is closer to the Nash equilibrium than the underlying base policy. This improvement is achieved with an amount of sampling that is independent of the state-space size. We base our approximation result on a more...
Journal
Journal title: Journal of Mathematical Analysis and Applications
Year: 2013
ISSN: 0022-247X
DOI: 10.1016/j.jmaa.2012.12.011