Cooperative/Competitive Behavior Acquisition Based on State Value Estimation of Others

Authors

  • Kentarou Noma
  • Yasutake Takahashi
  • Minoru Asada
Abstract

Existing reinforcement learning approaches suffer from the curse of dimensionality when applied to multiagent dynamic environments. A typical example is the RoboCup competition, where other agents and their behaviors easily cause the state and action spaces to explode. This paper presents a method of modular learning in a multiagent environment by which the learning agent can acquire cooperative behaviors with its teammates and competitive ones against its opponents. The key ideas are as follows. First, a two-layer hierarchical system with multiple learning modules is adopted to reduce the size of the sensor and action spaces. The state space of the top layer consists of the state values from the lower level, and macro actions are used to reduce the size of the physical action space. Second, the extent to which another agent is close to its own goal is estimated by observation and used as a state value in the top-layer state space to realize cooperative/competitive behaviors. The method is applied to a 4 (defence team) on 5 (offence team) game task, and the learning agent successfully acquired teamwork plays (pass and shoot) within a much shorter learning time (30 times quicker than the earlier work).
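The two key ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tabular Q-learning update, the three-level discretization, and every name below (`discretize`, `top_layer_state`, `TopLayerQ`) are assumptions made for illustration. The abstract only states that the top-layer state consists of lower-level state values, including values estimated for other agents, and that the top layer selects macro actions.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Observation = Sequence[float]
# Hypothetical lower-layer module: maps an observation to a state value in
# [0, 1], read as "how close this module (or another agent) is to its goal".
ValueFn = Callable[[Observation], float]
State = Tuple[int, ...]


def discretize(value: float, bins: int = 3) -> int:
    """Map a state value in [0, 1] to one of `bins` coarse levels."""
    clipped = min(max(value, 0.0), 1.0)
    return min(int(clipped * bins), bins - 1)


def top_layer_state(
    obs: Observation,
    own_modules: List[ValueFn],
    other_estimators: List[ValueFn],
    bins: int = 3,
) -> State:
    """Build the top-layer state from lower-level state values.

    The learner's own module values come first, followed by the values
    estimated for other agents from observation.
    """
    values = [v(obs) for v in own_modules] + [v(obs) for v in other_estimators]
    return tuple(discretize(x, bins) for x in values)


class TopLayerQ:
    """Tabular Q-learning over macro actions (an assumed learning rule)."""

    def __init__(self, macro_actions: List[str],
                 alpha: float = 0.1, gamma: float = 0.9) -> None:
        self.macro_actions = list(macro_actions)
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor
        self.q: Dict[Tuple[State, str], float] = {}

    def value(self, state: State, action: str) -> float:
        # Unvisited state-action pairs default to 0.
        return self.q.get((state, action), 0.0)

    def best(self, state: State) -> str:
        """Return the greedy macro action for `state`."""
        return max(self.macro_actions, key=lambda a: self.value(state, a))

    def update(self, state: State, action: str,
               reward: float, next_state: State) -> None:
        """Apply one standard Q-learning backup."""
        target = reward + self.gamma * max(
            self.value(next_state, a) for a in self.macro_actions
        )
        old = self.value(state, action)
        self.q[(state, action)] = old + self.alpha * (target - old)
```

With one own module and one teammate estimator, each discretized into three levels, the top layer sees only 3 × 3 = 9 states regardless of how large the raw sensor space is. That is the kind of reduction the abstract attributes to the hierarchical design.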

Similar resources

Efficient Behavior Learning Based on State Value Estimation of Self and Others

Existing reinforcement learning methods suffer seriously from the curse of dimensionality, especially when applied to multiagent dynamic environments. A typical example is the RoboCup competition, where other agents and their behavior easily cause the state and action spaces to explode. This paper presents a method of modular learning in a multiagent enviro...

Comparison of dyad training method with cooperative and competitive approach in the learning of Basketball Free Throw

This study aimed to compare the dyad training method under cooperative and competitive approaches in learning the basketball free throw. The study sample included 24 female students aged 13-15 years who had no experience with free throws. Based on pretest scores, the participants were assigned to either the cooperative dyad training group or the competitive dyad training group. In ...

Change Point Estimation of the Stationary State in Auto Regressive Moving Average Models, Using Maximum Likelihood Estimation and Singular Value Decomposition-based Filtering

This paper applies change point estimation, for the first time, to the stationary state of an autoregressive moving average model, ARMA(1,1). In the monitoring phase, when the characteristic under study follows an ARMA(1,1) time series, a maximum likelihood approach will be developed for the estimation of the stationary state’s change...

Error Modeling in Distribution Network State Estimation Using RBF-Based Artificial Neural Network

State estimation is essential for obtaining observable network models for online monitoring and analysis of power systems. With the integration of distributed energy resources and new technologies, state estimation in distribution systems becomes necessary. However, accurate input data are essential for an accurate estimation, along with knowledge of the possible correlation between the real and...

Cooperative Behavior Acquisition in Multi Mobile Robots Environment by Reinforcement Learning Based on State Vector Estimation

This paper proposes a method that acquires purposive behaviors based on the estimation of state vectors. To acquire cooperative behaviors in multi-robot environments, each learning robot separately estimates a local predictive model between the learner and the other objects. Based on the local predictive models, robots learn the desired behaviors using reinforcement learning....

Publication date: 2007