On No-Regret Learning, Fictitious Play, and Nash Equilibrium
Authors
Abstract
This paper addresses the question: what is the outcome of multi-agent learning via no-regret algorithms in repeated games? Specifically, can the outcome of no-regret learning be characterized by traditional game-theoretic solution concepts, such as Nash equilibrium? The conclusion of this study is that no-regret learning is reminiscent of fictitious play: play converges to Nash equilibrium in dominance-solvable, constant-sum, and general-sum 2×2 games, but cycles exponentially in the Shapley game. Notably, however, the information required of fictitious play far exceeds that of no-regret learning.
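As an illustrative sketch (not the paper's own experiments), a standard no-regret procedure, regret matching due to Hart and Mas-Colell, can be run in self-play on matching pennies, a constant-sum 2×2 game of the kind the abstract mentions; the empirical frequencies of play approach the mixed Nash equilibrium (1/2, 1/2). All names and parameter values below are illustrative.

```python
import random

# Row player's payoffs in matching pennies (zero-sum 2x2);
# the column player's payoff is the negation.
A = [[1.0, -1.0],
     [-1.0, 1.0]]

def regret_matching(T=20000, seed=0):
    """Self-play via regret matching: each round, play each action with
    probability proportional to its positive cumulative regret. Returns
    the row player's empirical action frequencies."""
    rng = random.Random(seed)
    reg_row = [0.0, 0.0]   # cumulative regrets, row player
    reg_col = [0.0, 0.0]   # cumulative regrets, column player
    counts_row = [0, 0]

    def sample(regrets):
        pos = [max(r, 0.0) for r in regrets]
        s = sum(pos)
        if s == 0:
            return rng.randrange(2)  # uniform when no positive regret
        return 0 if rng.random() < pos[0] / s else 1

    for _ in range(T):
        i, j = sample(reg_row), sample(reg_col)
        counts_row[i] += 1
        for a in range(2):  # regret vs. having always played action a
            reg_row[a] += A[a][j] - A[i][j]
            reg_col[a] += (-A[i][a]) - (-A[i][j])
    return [c / T for c in counts_row]

print(regret_matching())  # empirical row frequencies, close to [0.5, 0.5]
```

The empirical joint play of regret matching is known to approach the set of correlated equilibria, which for matching pennies is exactly the uniform mixed equilibrium.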
Related papers
No-regret dynamics and fictitious play
Potential-based no-regret dynamics are shown to be related to fictitious play. Roughly, these are ε-best-reply dynamics where ε is the maximal regret, which vanishes with time. This allows for alternative, and sometimes much shorter, proofs of known results on the convergence of no-regret dynamics to the set of Nash equilibria.
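To make the ε concrete, here is a minimal sketch (illustrative names, not taken from the paper) of the maximal average regret: a learner whose average regret is ε is, on average, playing an ε-best reply to the observed sequence.

```python
def average_regret(payoffs_per_round, actions_played):
    """Maximal average regret eps = max_a (1/T) * sum_t (u_t(a) - u_t(a_t)).

    payoffs_per_round[t][a]: payoff action a would have earned at round t;
    actions_played[t]: the action actually chosen at round t."""
    T = len(actions_played)
    n_actions = len(payoffs_per_round[0])
    earned = sum(payoffs_per_round[t][actions_played[t]] for t in range(T))
    best = max(sum(payoffs_per_round[t][a] for t in range(T))
               for a in range(n_actions))
    return (best - earned) / T

# Example: four rounds of a two-action game; the learner's average
# regret here is its epsilon as an eps-best-reply player.
rounds = [[1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, -1.0]]
print(average_regret(rounds, [0, 1, 0, 0]))  # -> 0.5
```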
Fair and Efficient Solutions to the Santa Fe Bar Problem
This paper asks the question: can adaptive, but not necessarily rational, agents learn Nash equilibrium behavior in the Santa Fe Bar Problem? To answer this question, three learning algorithms are simulated: fictitious play, no-regret learning, and Q-learning. Conditions under which these algorithms can converge to equilibrium behavior are isolated. But it is noted that the pure strategy Nash e...
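A toy sketch of smoothed-fictitious-play-style dynamics in a bar problem (the setup and all parameters are illustrative, not the paper's simulations): each round, every agent attends with a logit probability in the predicted slack between capacity and expected attendance, and beliefs track the empirical average of past attendance.

```python
import math
import random

def simulate_bar(N=100, capacity=60, T=500, tau=5.0, seed=1):
    """Smoothed best response against the empirical average attendance
    (a stand-in for smooth fictitious play). Returns mean attendance
    over the last 100 rounds."""
    rng = random.Random(seed)
    belief = N / 2.0          # prior belief about total attendance
    history = []
    for t in range(1, T + 1):
        # attend with logit probability in the predicted slack
        p = 1.0 / (1.0 + math.exp(-(capacity - belief) / tau))
        attendance = sum(1 for _ in range(N) if rng.random() < p)
        history.append(attendance)
        belief += (attendance - belief) / t  # running empirical average
    return sum(history[-100:]) / 100.0

print(simulate_bar())  # attendance hovers near capacity
```

With deterministic best replies instead of the logit smoothing, all agents act identically and attendance oscillates between everyone and no one, which is one way to see why pure-strategy equilibria are hard to learn here.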
A General Class of No-Regret Learning Algorithms and Game-Theoretic Equilibria
A general class of no-regret learning algorithms, called no-Φ-regret learning algorithms, is defined which spans the spectrum from no-external-regret learning to no-internal-regret learning and beyond. The set Φ describes the set of strategies to which the play of a given learning algorithm is compared. A learning algorithm satisfies no-Φ-regret if no regret is experienced for playing as the al...
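As a concrete instance (illustrative code, not from the paper): taking Φ to be the set of maps that send one action to another yields internal (swap) regret, while taking Φ to be the constant maps recovers external regret. A sketch of the swap-regret computation:

```python
def swap_regret(payoffs, actions):
    """Average internal (swap) regret: the best average gain obtainable by
    replacing every play of some action i with a fixed action j.
    payoffs[t][a] is the payoff action a would have earned at round t."""
    T, n = len(actions), len(payoffs[0])
    best = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                gain = sum(payoffs[t][j] - payoffs[t][i]
                           for t in range(T) if actions[t] == i)
                best = max(best, gain)
    return best / T

# Three rounds in which action 1 was always played, though action 0
# was better in the first two rounds:
print(swap_regret([[1, 0], [1, 0], [0, 1]], [1, 1, 1]))  # -> 0.333...
```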
Unifying Convergence and No-Regret in Multiagent Learning
We present a new multiagent learning algorithm, RVσ(t), that builds on an earlier version, ReDVaLeR. ReDVaLeR could guarantee (a) convergence to best response against stationary opponents and either (b) constant bounded regret against arbitrary opponents, or (c) convergence to Nash equilibrium policies in self-play. But it makes two strong assumptions: (1) that it can distinguish between self-...
Dynamics in Games
The purpose of the course is to study several dynamics generated by strategic interactions in games. Among the topics are adaptive dynamics in evolutionary game theory, robust procedures for on-line algorithms, and stochastic approximation. 1. Fictitious play: discrete time; continuous time and best-reply dynamics. 2. Replicator dynamics: n populations; one population; evolutionarily stable strategies. 3...