The Eigenoption-Critic Framework

نویسندگان

  • Miao Liu
  • Marlos C. Machado
  • Gerald Tesauro
  • Murray Campbell
چکیده

Eigenoptions (EOs) have been recently introduced as a promising idea for generating a diverse set of options through the graph Laplacian, having been shown to allow efficient exploration Machado et al. [2017a]. Despite its first initial promising results, a couple of issues in current algorithms limit its application, namely: 1) EO methods require two separate steps (eigenoption discovery and reward maximization) to learn a control policy, which can incur a significant amount of storage and computation; 2) EOs are only defined for problems with discrete state-spaces and; 3) it is not easy to take the environment’s reward function into consideration when discovering EOs. In this paper, we introduce an algorithm termed eigenoptioncritic (EOC) that addresses these issues. It is based on the Option-critic (OC) architecture Bacon et al. [2017], a general hierarchical reinforcement learning algorithm that allows learning the intra-option policies simultaneously with the policy over options. We also propose a generalization of EOC to problems with continuous state-spaces through the Nyström approximation. EOC can also be seen as extending OC to nonstationary settings, where the discovered options are not tailored for a single task.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Actor-Critic based Training Framework for Abstractive Summarization

We present a training framework for neural abstractive summarization based on actor-critic approaches from reinforcement learning. In the traditional neural network based methods, the objective is only to maximize the likelihood of the predicted summaries, no other assessment constraints are considered, which may generate low-quality summaries or even incorrect sentences. To alleviate this prob...

متن کامل

Variance Adjusted Actor Critic Algorithms

We present an actor-critic framework for MDPs where the objective is the variance-adjusted expected return. Our critic uses linear function approximation, and we extend the concept of compatible features to the variance-adjusted setting. We present an episodic actor-critic algorithm and show that it converges almost surely to a locally optimal point of the objective function. Index Terms Reinfo...

متن کامل

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and brittle convergence properties, which necessitate meticulous hyperparameter tuning. Both of these challenges severely limit the applicability of such methods t...

متن کامل

Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning

This paper presents a new method — adversarial advantage actor-critic (Adversarial A2C), which significantly improves the efficiency of dialogue policy learning in taskcompletion dialogue systems. Inspired by generative adversarial networks (GAN), we train a discriminator to differentiate responses/actions generated by dialogue agents from responses/actions by experts. Then, we incorporate the ...

متن کامل

Learning to Learn: Meta-Critic Networks for Sample Efficient Learning

We propose a novel and flexible approach to meta-learning for learning-to-learn from only a few examples. Our framework is motivated by actor-critic reinforcement learning, but can be applied to both reinforcement and supervised learning. The key idea is to learn a meta-critic: an action-value function neural network that learns to criticise any actor trying to solve any specified task. For sup...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1712.04065  شماره 

صفحات  -

تاریخ انتشار 2017