Memorizing the Playout Policy
Authors
Abstract
Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). Playout Policy Adaptation with move Features (PPAF) is a state-of-the-art MCTS algorithm that learns a playout policy online. We propose a simple modification to PPAF: memorizing the learned policy from one move to the next. We test PPAF with memorization (PPAFM) against PPAF and UCT on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Knightthrough, Misere Knightthrough and Nogo.
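The idea of memorizing the policy between moves can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that the learned policy is kept from one move to the next. The class `PlayoutPolicy`, the function `start_search`, the step size `alpha`, and the softmax-style weight update are all assumptions made for illustration, loosely modeled on the general playout policy adaptation scheme.

```python
import math
import random


class PlayoutPolicy:
    """Softmax playout policy over move codes (hypothetical helper)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha    # assumed learning step
        self.weights = {}     # move code -> weight, missing means 0.0

    def probabilities(self, moves):
        """Softmax probabilities of the given legal moves."""
        exps = [math.exp(self.weights.get(m, 0.0)) for m in moves]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self, moves, rng=random):
        """Draw one legal move according to the current policy."""
        return rng.choices(moves, weights=self.probabilities(moves), k=1)[0]

    def adapt(self, played, moves):
        """Reinforce the move played by the playout winner.

        Every legal move is lowered in proportion to its probability,
        then the played move is raised by alpha.
        """
        probs = self.probabilities(moves)
        for m, p in zip(moves, probs):
            self.weights[m] = self.weights.get(m, 0.0) - self.alpha * p
        self.weights[played] = self.weights.get(played, 0.0) + self.alpha

    def reset(self):
        """Forget everything learned so far."""
        self.weights.clear()


def start_search(policy, memorize):
    """Prepare the policy before searching the next game move.

    Without memorization the policy starts from scratch for each move;
    with memorization the weights learned so far are kept.
    """
    if not memorize:
        policy.reset()
    return policy
```

In this sketch the only difference between the two variants is whether `reset` is called before each search, which is one way to read "memorizing the learned policy from one move to the next".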
Similar resources
Playout policy adaptation with move features
Monte Carlo Tree Search (MCTS) is the state of the art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We also propose to learn a policy not only using the moves but also according to the features of the moves. We test the resulting algorithms named Playout Policy Adaptation (PPA) and Playout Policy Adaptation with move Featur...
Playout Policy Adaptation for Games
Monte Carlo Tree Search (MCTS) is the state of the art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We test the resulting algorithm named Playout Policy Adaptation (PPA) on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Go, Knightthrough, Misere Knightthrough, Nogo and Misere Nogo. For most of ...
Joint Power/Playout Control Schemes for Media Streaming over Wireless Links
We investigate transmission and playout policies for streaming media over a wireless link. In particular, we choose both the power at the transmitter and the playout rate at the receiver, in order to minimize the power consumption and maximize the media quality. We formulate the problem using a dynamic programming approach, study the structural properties of the optimal solution, develop justif...
Optimization of a packet video receiver under different levels of delay jitter: an analytical approach
This paper studies the problem of analyzing and designing optimal playout adaptation policies for packet video receivers (PVRs) that operate in a delay jitter inducing best-effort network, like the current Internet. The developed system model is built around the Ek/Di/1/N phase-type queue and allows for the effective modeling of key design and system parameters, such as: the level of delay jitt...
On the impact of playout scheduling on the performance of peer-to-peer live streaming
In this paper we examine the impact of the adopted playout policy on the performance of P2P live streaming systems. We argue and demonstrate experimentally that (popular) playout policies which permit the divergence of the playout points of different nodes can deteriorate drastically the performance of P2P live streaming. Consequently, we argue in favor of keeping different playout points “near...