Reinforcement Learning for Penalty Avoiding Policy Making and its Extensions and an Application to the Othello Game

Author

  • Kazuteru Miyazaki
Abstract

The purpose of a reinforcement learning system is, in general, to learn optimal policies. However, from an engineering point of view, it is useful and important to acquire not only optimal policies but also penalty avoiding policies. In this paper, we focus on the formation of penalty avoiding policies based on the Penalty Avoiding Rational Policy Making algorithm [1]. In applying the algorithm to large-scale problems, we are confronted with combinatorial explosion. To suppress this problem, especially in the number of states, we introduce several ideas and heuristics. We implemented the proposed method as the learning system of an Othello game player. After learning, this player can always defeat the well-known Othello game program KITTY [7].


Related resources

Reinforcement Learning in 2-players Games

The purpose of a reinforcement learning system is, in general, to learn an optimal policy. However, in 2-player games such as the Othello game, it is important to acquire a penalty avoiding policy. In this paper, we focus on the formation of penalty avoiding policies based on the Penalty Avoiding Rational Policy Making algorithm [2]. In applying it to large-scale problems, we are confronted with ...


Reinforcement learning for penalty avoiding policy making

Reinforcement learning is a kind of machine learning. It aims to adapt an agent to a given environment using reward as a clue. In general, the purpose of a reinforcement learning system is to acquire an optimal policy that maximizes the expected reward per action. However, this is not always important in every environment. In particular, if we apply a reinforcement learning system to engineering, we e...


Application of reinforcement learning to the game of Othello

Operations research and management science are often confronted with sequential decision making problems with large state spaces. Standard methods that are used for solving such complex problems are associated with some difficulties. As we discuss in this article, these methods are plagued by the so-called curse of dimensionality and the curse of modelling. In this article, we discuss reinforce...


An Adaptive Learning Game for Autistic Children using Reinforcement Learning and Fuzzy Logic

This paper presents an adapted serious game for rating social ability in children with autism spectrum disorder (ASD). The required measurements are obtained through the challenges of the proposed serious game. The game uses reinforcement learning concepts to be adaptive, and it relies on fuzzy logic to evaluate the social ability level of children with ASD. The game adapts itsel...


A JOINT DUTY CYCLE SCHEDULING AND ENERGY AWARE ROUTING APPROACH BASED ON EVOLUTIONARY GAME FOR WIRELESS SENSOR NETWORKS

Network throughput and energy conservation are two important performance metrics for wireless sensor networks. Since these two objectives conflict with each other, it is difficult to achieve them simultaneously. In this paper, a joint duty cycle scheduling and energy aware routing approach based on evolutionary game theory, called DREG, is proposed. Making a trade-off ...




Publication date: 2000