Convergent Multiple-timescales Reinforcement Learning Algorithms in Normal Form Games

Authors

  • DAVID S. LESLIE
  • E. J. COLLINS
Abstract

We consider reinforcement learning algorithms in normal form games. Using two-timescales stochastic approximation, we introduce a model-free algorithm which is asymptotically equivalent to the smooth fictitious play algorithm, in that both result in asymptotic pseudotrajectories to the flow defined by the smooth best response dynamics. Both of these algorithms are shown to converge almost surely to Nash distribution in two-player zero-sum games and N-player partnership games. However, there are simple games for which these, and most other adaptive processes, fail to converge; in particular, we consider the N-player matching pennies game and Shapley’s variant of the rock–scissors–paper game. By extending stochastic approximation results to multiple timescales we can allow each player to learn at a different rate. We show that this extension will converge for two-player zero-sum games and two-player partnership games, as well as for the two special cases we consider.
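
The two-timescales idea in the abstract can be illustrated with a minimal sketch; it is not the authors' exact algorithm. It assumes a Boltzmann (logit) smooth best response, two-player matching pennies as the zero-sum game, and step sizes n^-0.6 (value estimates) and 1/n (strategies), chosen only so that their ratio tends to zero. The names `smooth_best_response` and `run_matching_pennies`, and the temperature value, are hypothetical.

```python
import math
import random


def smooth_best_response(q_values, temperature=0.2):
    """Boltzmann (logit) smooth best response to a list of action-value estimates."""
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def run_matching_pennies(steps=20000, temperature=0.2, seed=0):
    """Two players learn from realised rewards only (no model of the opponent)."""
    rng = random.Random(seed)
    # Row player's payoff; the column player receives the negative (zero-sum).
    payoff = [[1.0, -1.0], [-1.0, 1.0]]
    q = [[0.0, 0.0], [0.0, 0.0]]    # q[i][a]: player i's value estimate for action a
    pi = [[0.5, 0.5], [0.5, 0.5]]   # pi[i]: player i's mixed strategy
    for n in range(1, steps + 1):
        fast = n ** -0.6            # assumed step size for value estimates (fast timescale)
        slow = 1.0 / n              # assumed step size for strategies; slow / fast -> 0
        actions = [0 if rng.random() < pi[i][0] else 1 for i in range(2)]
        reward = payoff[actions[0]][actions[1]]
        rewards = [reward, -reward]
        for i in range(2):
            a = actions[i]
            # Fast timescale: update only the estimate of the action just played.
            q[i][a] += fast * (rewards[i] - q[i][a])
            # Slow timescale: move the strategy toward the smooth best response.
            target = smooth_best_response(q[i], temperature)
            pi[i] = [p + slow * (t - p) for p, t in zip(pi[i], target)]
    return pi
```

Calling `run_matching_pennies()` returns each player's final mixed strategy. Because the strategy step size shrinks faster than the value-estimate step size, the value estimates see an almost stationary opponent, which is the intuition behind the timescale separation; giving each player a different strategy step size would be the natural route to the multiple-timescales variant the abstract mentions.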

Similar resources

Multiagent Reinforcement Learning in Stochastic Games with Continuous Action Spaces

We investigate the learning problem in stochastic games with continuous action spaces. We focus on repeated normal form games, and discuss issues in modelling mixed strategies and adapting learning algorithms in finite-action games to the continuous-action domain. We applied variable resolution techniques to two simple multi-agent reinforcement learning algorithms PHC and MinimaxQ. Preliminary ...

Bifurcation Analysis of Reinforcement Learning Agents in the Selten's Horse Game

The application of reinforcement learning algorithms to multiagent domains may cause complex non-convergent dynamics. The replicator dynamics, commonly used in evolutionary game theory, proved to be effective for modeling the learning dynamics in normal form games. Nonetheless, it is often interesting to study the robustness of the learning dynamics when either learning or structural parameters...

Designing Learning Algorithms over the Sequence Form of an Extensive-Form Game

We focus on multi-agent learning over extensive-form games. When designing algorithms for extensive-form games, it is common to resort to tabular representations (i.e., normal form, agent form, and sequence form). Each representation provides some advantages and suffers from some drawbacks and it is not known which representation, if any, is the best one in multi-agent learning. In particular,...

Rational and Convergent Learning in Stochastic Games

This paper investigates the problem of policy learning in multiagent environments using the stochastic game framework, which we briefly overview. We introduce two properties as desirable for a learning agent when in the presence of other learning agents, namely rationality and convergence. We examine existing reinforcement learning algorithms according to these two properties and notice that th...

Publication date: 2002