Feudal Reinforcement Learning

نویسندگان

  • Peter Dayan
  • Geoffrey E. Hinton
چکیده

One way to speed up reinforcement learning is to enable learning to happen simultaneously at multiple resolutions in space and time. This paper shows how to create a Q-learning managerial hierarchy in which high level managers learn how to set tasks to their sub-managers who, in turn, learn how to satisfy them. Sub-managers need not initially understand their managers’ commands. They simply learn to maximise their reinforcement in the context of the current command. We illustrate the system using a simple maze task.. As the system learns how to get around, satisfying commands at the multiple levels, it explores more efficiently than standard, flat, Q-learning and builds a more comprehensive map.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

FeUdal Networks for Hierarchical Reinforcement Learning

We introduce FeUdal Networks (FuNs): a novel architecture for hierarchical reinforcement learning. Our approach is inspired by the feudal reinforcement learning proposal of Dayan and Hinton, and gains power and efficacy by decoupling end-to-end learning across multiple levels – allowing it to utilise different resolutions of time. Our framework employs a Manager module and a Worker module. The ...

متن کامل

Using Hierarchical Reinforcement Learning to Balance Conflicting Sub-Problems

This paper describes the adaption and application of an algorithm called Feudal Reinforcement Learning to a complex gridworld navigation problem. The algorithm proved to be not easily adaptable and the results were unsatisfactory.

متن کامل

Feudal Reinforcement Learning for Dialogue Management in Large Domains

Reinforcement learning (RL) is a promising approach to solve dialogue policy optimisation. Traditional RL algorithms, however, fail to scale to large domains due to the curse of dimensionality. We propose a novel Dialogue Management architecture, based on Feudal RL, which decomposes the decision into two steps; a first step where a master policy selects a subset of primitive actions, and a seco...

متن کامل

Feudal Q-learning

One popular way of exorcising the ddmon of dimensionality in dynamic programming is to consider spatial and temporal hierarchies for representing the value functions and policies. This paper develops a hierarchical method for Q-learning which is based on the familiar notion of a recursive feudal serfdom, with managers setting tasks and giving rewards and punishments to their juniors and in thei...

متن کامل

Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach

Wireless Sensor Networks (WSNs) are consist of independent distributed sensors with storing, processing, sensing and communication capabilities to monitor physical or environmental conditions. There are number of challenges in WSNs because of limitation of battery power, communications, computation and storage space. In the recent years, computational intelligence approaches such as evolutionar...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1992