Fictitious Play in Zero-Sum Stochastic Games

Abstract

We present a novel variant of fictitious play dynamics combining classical fictitious play with $Q$-learning for stochastic games and analyze its convergence properties in two-player zero-sum stochastic games. Our dynamics involves players forming beliefs on the opponent strategy and their own continuation payoff ($Q$-function), and playing a greedy best response by using the estimated continuation payoffs. Players update their beliefs from observations of opponent actions. A key property of the learning dynamics is that the update of the beliefs on $Q$-functions occurs at a slower timescale than the update of the beliefs on strategies. We show that in both the model-based and model-free cases (without knowledge of player payoff functions and state transition probabilities), the beliefs on strategies converge to a stationary mixed Nash equilibrium of the zero-sum stochastic game.
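The two-timescale idea in the abstract can be illustrated with a minimal model-based sketch. This is not the authors' algorithm as published. The class `Player`, the function `simulate`, the step sizes `alpha = 1/(c+1)` and `beta = 1/((c+1) log(c+1))`, the update order, and the single-state matching-pennies game are all assumptions chosen for illustration. The only properties taken from the abstract are that each player keeps a belief on the opponent strategy and on its own $Q$-function, plays a greedy best response to the estimated payoffs, and updates the $Q$-belief on a slower timescale than the strategy belief.

```python
import math
import random


class Player:
    """Hypothetical learner: a belief on the opponent strategy plus an own Q-estimate."""

    def __init__(self, reward, trans, gamma):
        # reward[s][own][opp] is this player's stage payoff,
        # trans[s][own][opp][s2] is the transition probability to state s2.
        self.reward, self.trans, self.gamma = reward, trans, gamma
        n_states, n_own, n_opp = len(reward), len(reward[0]), len(reward[0][0])
        self.belief = [[1.0 / n_opp] * n_opp for _ in range(n_states)]
        self.q = [[[0.0] * n_opp for _ in range(n_own)] for _ in range(n_states)]
        self.visits = [0] * n_states

    def action_values(self, s):
        # Expected Q of each own action under the current belief on the opponent.
        return [sum(q * b for q, b in zip(row, self.belief[s])) for row in self.q[s]]

    def act(self, s):
        vals = self.action_values(s)
        return max(range(len(vals)), key=vals.__getitem__)  # greedy best response

    def value(self, s):
        return max(self.action_values(s))

    def update(self, s, opp_action):
        self.visits[s] += 1
        c = self.visits[s]
        alpha = 1.0 / (c + 1)                       # fast step (assumed form)
        beta = 1.0 / ((c + 1) * math.log(c + 1))    # slow step: beta / alpha -> 0
        # Fast timescale: move the strategy belief toward the observed action.
        for b in range(len(self.belief[s])):
            target = 1.0 if b == opp_action else 0.0
            self.belief[s][b] += alpha * (target - self.belief[s][b])
        # Slow timescale: model-based Q update for every action pair in state s.
        values = [self.value(s2) for s2 in range(len(self.reward))]
        for a, row in enumerate(self.q[s]):
            for b in range(len(row)):
                cont = sum(p * v for p, v in zip(self.trans[s][a][b], values))
                row[b] += beta * (self.reward[s][a][b] + self.gamma * cont - row[b])


def simulate(reward1, trans, gamma, steps, seed=0):
    """Run both learners; reward1[s][a][b] is player 1's payoff (player 2 gets the negative)."""
    rng = random.Random(seed)
    n_states = len(reward1)
    n_a, n_b = len(reward1[0]), len(reward1[0][0])
    # Player 2 sees the same game with own/opponent indices swapped and payoffs negated.
    reward2 = [[[-reward1[s][a][b] for a in range(n_a)] for b in range(n_b)]
               for s in range(n_states)]
    trans2 = [[[trans[s][a][b] for a in range(n_a)] for b in range(n_b)]
              for s in range(n_states)]
    p1, p2 = Player(reward1, trans, gamma), Player(reward2, trans2, gamma)
    s = 0
    for _ in range(steps):
        a, b = p1.act(s), p2.act(s)
        p1.update(s, b)
        p2.update(s, a)
        s = rng.choices(range(n_states), weights=trans[s][a][b])[0]
    return p1, p2


# Degenerate one-state example (matching pennies), used only because its
# equilibrium (both players mix uniformly) is easy to check by hand.
PENNIES_REWARD = [[[1.0, -1.0], [-1.0, 1.0]]]
PENNIES_TRANS = [[[[1.0], [1.0]], [[1.0], [1.0]]]]
```

Calling `simulate(PENNIES_REWARD, PENNIES_TRANS, gamma=0.5, steps=5000)` returns the two learners, and `p1.belief[0]` and `p2.belief[0]` can then be compared with the uniform equilibrium strategy. A model-free variant would replace the expectation over `trans` with the observed next state and would need the stage payoff to be observed rather than known; that case is not sketched here.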

Similar resources

Fictitious play in stochastic games

In this paper we examine an extension of the fictitious play process for bimatrix games to stochastic games. We show that the fictitious play process does not necessarily converge, not even in the 2 × 2 × 2 case with a unique equilibrium in stationary strategies. Here 2 × 2 × 2 stands for 2 players, 2 states, 2 actions for each player in each state.

A Weakened Form of Fictitious Play in Two-Person Zero-Sum Games

Fictitious play can be seen as a numeric iteration procedure for determining the value of a game and corresponding optimal strategies. Although convergence is slow, it needs only modest computer storage. Therefore it seems to be a good approach for analyzing large games. In this paper we consider a weakened form of fictitious play, which can be interpreted that players at each stage do not ha...

Definable Zero-Sum Stochastic Games

Definable zero-sum stochastic games involve a finite number of states and action sets, reward and transition functions that are definable in an o-minimal structure. Prominent examples of such games are finite, semi-algebraic or globally subanalytic stochastic games. We prove that the Shapley operator of any definable stochastic game with separable transition and reward functions is definable in...

Fictitious play in coordination games

We study the Fictitious Play process with bounded and unbounded recall in pure coordination games for which failing to coordinate yields a payoff of zero for both players. It is shown that every Fictitious Play player with bounded recall may fail to coordinate against his own type. On the other hand, players with unbounded recall are shown to coordinate (almost surely) against their own type as...

A new class of Hamiltonian flows with random-walk behavior originating from zero-sum games and Fictitious Play

In this paper we relate dynamics associated to zero-sum games (fictitious play) to Hamiltonian dynamics. It turns out that the Hamiltonian dynamics induced from fictitious play has properties that are rather different from those found in more classically defined Hamiltonian dynamics. Although the vector field is piecewise constant (and so the flow $\varphi_t$ is piecewise a translation), the dyna...

Journal

Journal title: SIAM Journal on Control and Optimization

Year: 2022

ISSN: 0363-0129, 1095-7138

DOI: https://doi.org/10.1137/21m1426675