A Comparison of Neural Network Architectures in Reinforcement Learning in the Game of Othello
Author
Abstract
Over the past two decades, Reinforcement Learning has emerged as a promising Machine Learning technique capable of solving complex dynamic problems. Its benefit lies in the fact that the agent learns from its own experience rather than being told directly. For problems with large state spaces, Reinforcement Learning algorithms are combined with function approximation techniques, such as neural networks. The architecture of the neural network plays a significant role in the agent's learning. Past research has demonstrated that networks with a constructive architecture outperform those with a fixed architecture on some benchmark problems. This study compares the performance of these two architectures in Othello, a complex deterministic board game. Three networks are used in the comparison: two with a constructive architecture (Cascade and Resource Allocating Network) and one with a fixed architecture (Multilayer Perceptron). The study also investigates input representation, the number of hidden nodes, and other parameters used by the networks. Training is performed with both on-policy (Sarsa) and off-policy (Q-Learning) algorithms. Results show that agents were able to learn the positional strategy (a novice strategy in Othello) and could beat each of the three built-in opponents. Agents trained with the Multilayer Perceptron perform better, but converge more slowly, than those trained with Cascade.
Similar resources
Learning Othello using Cooperative and Competitive Neuroevolution
From the early days of computing, making computers play games like chess and Othello with a high level of skill has been a challenging and, lately, rewarding task. As computing power increases, more and more complex learning techniques are employed to allow computers to learn different tasks. Games, however, remain a challenging and exciting domain for testing new technique...
An Unsupervised Learning Method for an Attacker Agent in Robot Soccer Competitions Based on the Kohonen Neural Network
The RoboCup competition, as a great test-bed, has become a popular domain worldwide in recent years. The main objective of such competitions is to deal with the complex behavior of systems which consist of multiple autonomous agents. The rich experience of human soccer players can be used as a valuable reference for a robot soccer player. However, because of the differences between real and simulated soc...
Reinforcement Learning in Neural Networks: A Survey
In recent years, research on reinforcement learning (RL) has focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. Using neural networks enables RL to search for optimal policies more efficiently in several real-life applicat...
Strategy Acquisition for the Game "Othello" Based on Reinforcement Learning
This article discusses automatic strategy acquisition for the game "Othello" based on reinforcement learning. In our approach, two computer players initially know only the game rules, but they become relatively stronger after playing several thousands of games against each other. In each game, the players refine the evaluation function for the game state, which is achieved in a reinforcement lea...