Reinforcement Learning for Penalty Avoiding Policy Making and its Extensions and an Application to the Othello Game
نویسنده
چکیده
The purpose of reinforcement learning system is to learn optimal policies in general. However, from the engineering point of view, it is useful and important to acquire not only optimal policies, but also penalty avoiding policies. In this paper, we are focused on formation of penalty avoiding policies based on the Penalty Avoiding Rational Policy Making algorithm [1]. In applying the algorithm to large-scale problems, we are confronted with the combinational explosion. To suppless the problem, especially the number of states, we introduce several ideas and heuristics. We implemented the proposed method as an Othello game player’s learning system. This learning player can always defeat against the well-known Othello game program KITTY [7] after learning.
منابع مشابه
Reinforcement Learning in 2-players Games
The purpose of reinforcement learning system is to learn an optimal policy in general. However, in 2players games such as the othello game, it is important to acquire a penalty avoiding policy. In this paper, we are focused on formation of penalty avoiding policies based on the Penalty Avoiding Rational Policy Making algorithm [2]. In applying it to large-scale problems, we are confronted with ...
متن کاملReinforcement learning for penalty avoiding policy making
Reinforcement Learning is a kind of machine learning. It aims to adapt an agent to a given environment with a clue to a reward. In general, the purpose of reinforcement learning system is to acquire an optimum policy that can maximize expected reward per an action. However, it is not always important for any environment. Especially, if we apply reinforcement learning system to engineering, we e...
متن کاملApplication of reinforcement learning to the game of Othello
Operations research and management science are often confronted with sequential decision making problems with large state spaces. Standard methods that are used for solving such complex problems are associated with some difficulties. As we discuss in this article, these methods are plagued by the so-called curse of dimensionality and the curse of modelling. In this article, we discuss reinforce...
متن کاملAn Adaptive Learning Game for Autistic Children using Reinforcement Learning and Fuzzy Logic
This paper, presents an adapted serious game for rating social ability in children with autism spectrum disorder (ASD). The required measurements are obtained by challenges of the proposed serious game. The proposed serious game uses reinforcement learning concepts for being adaptive. It is based on fuzzy logic to evaluate the social ability level of the children with ASD. The game adapts itsel...
متن کاملA JOINT DUTY CYCLE SCHEDULING AND ENERGY AWARE ROUTING APPROACH BASED ON EVOLUTIONARY GAME FOR WIRELESS SENSOR NETWORKS
Network throughput and energy conservation are two conflicting important performance metrics for wireless sensor networks. Since these two objectives are in conflict with each other, it is difficult to achieve them simultaneously. In this paper, a joint duty cycle scheduling and energy aware routing approach is proposed based on evolutionary game theory which is called DREG. Making a trade-off ...
متن کامل