An Architecture for Behavior-Based Reinforcement Learning
Abstract
This paper introduces an integration of reinforcement learning and behavior-based control designed to produce real-time learning in situated agents. The model layers a distributed and asynchronous reinforcement learning algorithm over a learned topological map and standard behavioral substrate to create a reinforcement learning complex. The topological map creates a small and task-relevant state space that aims to make learning feasible, while the distributed and asynchronous aspects of the architecture make it compatible with behavior-based design principles. We present the design, implementation and results of an experiment that requires a mobile robot to perform puck foraging in three artificial arenas using the new model, random decision making, and layered standard reinforcement learning. The results show that our model is able to learn rapidly on a real robot in a real environment, learning and adapting to change more quickly than both alternatives. We show that the robot is able to make the best choices it can given its drives and experiences using only local decisions and therefore displays planning behavior without the use of classical planning techniques.
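The abstract's central idea, learning values over a small topological map using only local, asynchronous updates, can be illustrated with a minimal sketch. This is not the paper's implementation: the class `MapNode`, the three-node corridor, the reward placement, and the parameters `alpha` and `gamma` are all assumptions made for illustration, and the update is an ordinary Q-learning-style backup applied one node at a time.

```python
import random


class MapNode:
    """One node of a hypothetical topological map that stores its own action values."""

    def __init__(self, name):
        self.name = name
        self.edges = {}    # action label -> neighbouring MapNode
        self.q = {}        # action label -> estimated value of taking that action
        self.reward = 0.0  # assumed reward received on arriving at this node

    def connect(self, action, neighbour):
        self.edges[action] = neighbour
        self.q.setdefault(action, 0.0)

    def value(self):
        # Value of a node is the value of its best outgoing action.
        return max(self.q.values()) if self.q else 0.0

    def local_update(self, alpha=0.5, gamma=0.9):
        """Back up values using only this node's immediate neighbours."""
        for action, nxt in self.edges.items():
            target = nxt.reward + gamma * nxt.value()
            self.q[action] += alpha * (target - self.q[action])

    def best_action(self):
        return max(self.q, key=self.q.get)


def build_corridor():
    # Assumed toy map: A <-> B <-> C, with a reward (say, a puck) found at C.
    a, b, c = MapNode("A"), MapNode("B"), MapNode("C")
    a.connect("east", b)
    b.connect("west", a)
    b.connect("east", c)
    c.connect("west", b)
    c.reward = 1.0
    return a, b, c


def run(updates=600, seed=0):
    # Nodes are updated one at a time in random order rather than in a
    # synchronised sweep, which stands in for asynchronous operation.
    rng = random.Random(seed)
    nodes = build_corridor()
    for _ in range(updates):
        rng.choice(nodes).local_update()
    return nodes


if __name__ == "__main__":
    for node in run():
        print(node.name, node.best_action(), node.q)
```

In this toy map, each node only ever reads its neighbours' values, yet following `best_action()` from node to node leads toward the rewarded node. That is the sense in which purely local decisions can produce behavior that looks planned.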
Similar resources
Reinforcement Learning Based PID Control of Wind Energy Conversion Systems
In this paper an adaptive PID controller for Wind Energy Conversion Systems (WECS) has been developed. The adaptation technique applied to this controller is based on Reinforcement Learning (RL) theory. Nonlinear characteristics of wind variations as plant input, wind turbine structure and generator operational behavior demand for high quality adaptive controller to ensure both robust stability an...
Operation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm
In this paper, the operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs), is proposed using a Deep Reinforcement Learning (DRL) based approach. Due to the dynamic characteristic of the problem, it is first formulated as a Markov Decision Process (MDP). Next, Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...
An Unsupervised Learning Method for an Attacker Agent in Robot Soccer Competitions Based on the Kohonen Neural Network
The RoboCup competition, as a great test-bed, has become a popular domain worldwide in recent years. The main objective of such competitions is to deal with the complex behavior of systems which consist of multiple autonomous agents. The rich experience of human soccer players can be used as a valuable reference for a robot soccer player. However, because of the differences between real and simulated soc...
Automatic Programming of Behavior-Based Robots Using Reinforcement Learning (Sridhar Mahadevan and Jonathan Connell)
This paper describes a general approach for automatically programming a behavior-based robot. New behaviors are learned by trial and error using a performance feedback function as reinforcement. Two algorithms for behavior learning are described that combine techniques for propagating reinforcement values temporally across actions and spatially across states. A behavior-based robot called OBELI...
Journal title:
- Adaptive Behaviour
Volume 13, issue
Pages -
Publication date: 2005