A model-based approximate λ-policy iteration approach to online evasive path planning and the video game Ms. Pac-Man
Abstract
This paper presents a model-based approximate λ-policy iteration approach using temporal differences for optimizing paths online for a pursuit-evasion problem, where an agent must visit several target positions within a region of interest while simultaneously avoiding one or more actively pursuing adversaries. This method is relevant to applications, such as robotic path planning, mobile-sensor applications, and path exposure. The methodology described utilizes cell decomposition to construct a decision tree and implements a temporal difference-based approximate λ-policy iteration to combine online learning with prior knowledge through modeling to achieve the objectives of minimizing the risk of being caught by an adversary and maximizing a reward associated with visiting target locations. Online learning and frequent decision tree updates allow the algorithm to quickly adapt to unexpected movements by the adversaries or dynamic environments. The approach is illustrated through a modified version of the video game Ms. Pac-Man, which is shown to be a benchmark example of the pursuit-evasion problem. The results show that the approach presented in this paper outperforms several other methods as well as most human players.
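As a rough illustration of the λ-policy iteration idea mentioned in the abstract, the sketch below runs an exact, tabular variant on a hypothetical five-cell corridor with an adversary at one end and a target at the other. It is not the paper's method: the cell decomposition, decision tree, and function approximation are omitted, and the corridor, the rewards, the slip probability, and every name in the code are assumptions made for this example only. Each iteration takes the greedy policy for the current value estimate, computes the expected temporal differences under the model, and adds their λ-discounted sum to the estimate.

```python
# Toy sketch of exact (tabular) lambda-policy iteration with a known model.
# Everything below is an illustrative assumption, not taken from the paper.

N = 5                           # corridor cells 0..4
ACTIONS = (-1, +1)              # move left / move right
SLIP = 0.1                      # assumed chance the move goes the opposite way
GAMMA, LAM = 0.95, 0.7          # discount factor and lambda (illustrative)
TERMINAL = {0: -10.0, 4: 10.0}  # adversary at cell 0, target at cell 4
STEP_COST = -0.1                # small cost for every non-terminal move


def transitions(s, a):
    """Model: (probability, next_state, reward) triples for action a in cell s."""
    out = []
    for p, step in ((1 - SLIP, a), (SLIP, -a)):
        s2 = min(max(s + step, 0), N - 1)
        out.append((p, s2, TERMINAL.get(s2, STEP_COST)))
    return out


def q_value(J, s, a):
    """Expected one-step return of action a in cell s under value estimate J."""
    return sum(p * (r + GAMMA * J[s2]) for p, s2, r in transitions(s, a))


def lambda_policy_iteration(iters=50, sweeps=200):
    J = [0.0] * N               # terminal cells keep value 0 throughout
    policy = {}
    for _ in range(iters):
        # Policy improvement: act greedily with respect to the current J.
        policy = {s: max(ACTIONS, key=lambda a: q_value(J, s, a))
                  for s in range(N) if s not in TERMINAL}
        # Expected temporal difference of each cell under that policy.
        d = [0.0 if s in TERMINAL else q_value(J, s, policy[s]) - J[s]
             for s in range(N)]
        # Solve delta = d + GAMMA*LAM * P_policy * delta by fixed-point sweeps
        # (a contraction because GAMMA * LAM < 1).
        delta = [0.0] * N
        for _ in range(sweeps):
            delta = [0.0 if s in TERMINAL else
                     d[s] + GAMMA * LAM * sum(p * delta[s2] for p, s2, _r
                                              in transitions(s, policy[s]))
                     for s in range(N)]
        # Lambda-policy-iteration update of the value estimate.
        J = [J[s] + delta[s] for s in range(N)]
    return J, policy


if __name__ == "__main__":
    values, greedy = lambda_policy_iteration()
    print(values, greedy)
```

In this sketch, setting `LAM` close to 1 makes each update approach a full policy evaluation, while `LAM = 0` reduces it to a value-iteration step; intermediate values trade off between the two.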
Similar resources
A Model-Based Approach to Optimizing Ms. Pac-Man Game Strategies in Real Time
This paper presents a model-based approach for computing real-time optimal decision strategies in the pursuit-evasion game of Ms. Pac-Man. The game of Ms. Pac-Man is an excellent benchmark problem of pursuit-evasion game with multiple, active adversaries that adapt their pursuit policies based on Ms. Pac-Man’s state and decisions. In addition to evading the adversaries, the agent must pursue mul...
A learning algorithm based on λ-policy iteration and its application to the video game "tetris attack"
We present an application of the λ-policy iteration, an algorithm based on neuro-dynamic programming (described by Bertsekas and Tsitsiklis [BT96]) to the video game Tetris Attack in the form of an automated player. To this end, we first introduce the theoretical foundations underlying the method and model the game as a dynamic programming problem. Afterwards, we perform multiple experiments us...
A Cellular Automaton Based Controller for a Ms. Pac-Man Agent
Video games can be used as an excellent test bed for Artificial Intelligence (AI) techniques. They are challenging and non-deterministic, which makes it very difficult to write strong AI players. An example of such a video game is Ms. Pac-Man. In this paper I will outline some of the previous techniques used to build AI controllers for Ms. Pac-Man as well as presenting a new and novel solution. ...
Enhancements for Monte-Carlo Tree Search in Ms Pac-Man
In this paper enhancements for the Monte-Carlo Tree Search (MCTS) framework are investigated to play Ms Pac-Man. MCTS is used to find an optimal path for an agent at each turn, determining the move to make based on randomised simulations. Ms Pac-Man is a real-time arcade game, in which the protagonist has several independent goals but no conclusive terminal state. Unlike games such as Chess or ...
Roomba Pac-Man: Teaching Autonomous Robotics through Embodied Gaming
We present an approach to teaching autonomous robotics to upper-level undergraduates through the medium of embodied games. As part of a developing course at Brown University, we have created the Roomba Pac-Man task to introduce students to different approaches to autonomous robot control in the context of a specific task. Roomba Pac-Man has been developed using commodity hardware from which stu...