Options Discovery with Budgeted Reinforcement Learning

Authors

  • Aurélia Léon
  • Ludovic Denoyer
Abstract

We consider the problem of learning hierarchical policies for reinforcement learning that are able to discover options, where an option is a sub-policy over a set of primitive actions. The models proposed over the last decade usually rely on a predefined set of options. We specifically address the problem of automatically discovering options in decision processes. We describe a new learning model, the Budgeted Option Neural Network (BONN), which discovers options based on a budgeted learning objective. The BONN model is evaluated on several classical RL problems, yielding interesting quantitative and qualitative results.
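
To make the notion of an option concrete, the following is a minimal sketch of a hand-written option in the generic sense used above: a sub-policy over primitive actions together with a rule for when it stops. It is not the BONN model, which learns such options rather than having them predefined. The 1-D corridor, the `Option` class, the `step` dynamics, and the `go_to_door` option are all hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = int
Action = int  # assumed primitive actions: -1 = step left, +1 = step right


@dataclass
class Option:
    """A sub-policy plus a rule for when it hands control back."""
    name: str
    policy: Callable[[State], Action]    # primitive action to take in a state
    terminates: Callable[[State], bool]  # True when the option should stop


def step(state: State, action: Action, size: int = 10) -> State:
    """Hypothetical corridor dynamics: move, then clip to [0, size - 1]."""
    return max(0, min(size - 1, state + action))


def run_option(option: Option, state: State,
               max_steps: int = 100) -> Tuple[State, List[Action]]:
    """Execute primitive actions chosen by the option until it terminates."""
    actions: List[Action] = []
    while not option.terminates(state) and len(actions) < max_steps:
        action = option.policy(state)
        state = step(state, action)
        actions.append(action)
    return state, actions


# A predefined (hand-crafted) option: walk to the "door" at position 5.
go_to_door = Option(
    name="go_to_door",
    policy=lambda s: 1 if s < 5 else -1,
    terminates=lambda s: s == 5,
)

final_state, trace = run_option(go_to_door, state=2)
print(final_state, trace)
```

A higher-level policy would choose among several such options instead of among primitive actions. The option here is written by hand, which is the situation the abstract describes for earlier models; automatic discovery replaces this manual step.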

Related resources

Improved Automatic Discovery of Subgoals for Options in Hierarchical Reinforcement Learning

Options have been shown to be a key step in extending reinforcement learning beyond low-level reactionary systems to higher-level, planning systems. Most of the options research involves hand-crafted options; there has been only very limited work in the automated discovery of options. We extend early work in automated option discovery with a flexible and robust method.

PAC-inspired Option Discovery in Lifelong Reinforcement Learning

A key goal of AI is to create lifelong learning agents that can leverage prior experience to improve performance on later tasks. In reinforcement learning problems, one way to summarize prior experience for future use is through options, which are behaviorally extended actions (subpolicies) for how to behave. Options can then be used to potentially accelerate learning in new reinforcement learn...

A Laplacian Framework for Option Discovery in Reinforcement Learning

Representation learning and option discovery are two of the biggest challenges in reinforcement learning (RL). Proto-value functions (PVFs) are a well-known approach for representation learning in MDPs. In this paper we address the option discovery problem by showing how PVFs implicitly define options. We do it by introducing eigenpurposes, intrinsic reward functions derived from the learned re...

Eigenoption Discovery through the Deep Successor Representation

Options in reinforcement learning allow agents to hierarchically decompose a task into subtasks, having the potential to speed up learning and planning. However, autonomously learning effective sets of options is still a major challenge in the field. In this paper we focus on the recently introduced idea of using representation learning methods to guide the option discovery process. Specificall...

Supplemental Material: A Laplacian Framework for Option Discovery in Reinforcement Learning

• Supporting lemmas and their respective proofs, as well as a more detailed proof of Theorem 3.1; • Description of how to easily compute the diffusion time in tabular MDPs; • The options leading to bottleneck states (doorways) we used in our experiments; • Performance comparisons between eigenoptions and options generated to reach randomly selected states; • Demonstration of the applicability o...

Journal title:
  • CoRR

Volume abs/1611.06824  Issue 

Pages  -

Publication date 2016