Sequential Constant Size Compressors for Reinforcement Learning

Authors

  • Linus Gisslén
  • Matthew D. Luciw
  • Vincent Graziano
  • Jürgen Schmidhuber
Abstract

Traditional Reinforcement Learning methods are insufficient for AGIs, which must be able to learn to deal with Partially Observable Markov Decision Processes. We investigate a novel method for dealing with this problem: standard RL techniques using as input the hidden layer output of a Sequential Constant-Size Compressor (SCSC). The SCSC takes the form of a sequential Recurrent Auto-Associative Memory, trained through standard back-propagation. Results illustrate the feasibility of this approach — this system learns to deal with high-dimensional visual observations (up to 640 pixels) in partially observable environments where there are long time lags (up to 12 steps) between relevant sensory information and necessary action.
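The architecture the abstract describes can be sketched as follows: a recurrent network folds an arbitrarily long observation history into a constant-size hidden code, and a standard RL component (here, a linear Q-function) acts on that code as if it were a Markov state. This is a minimal illustrative sketch, not the paper's implementation; all dimensions, weights, and names are assumptions, and the auto-associative training by back-propagation is only indicated in comments.

```python
import numpy as np

# Hedged sketch of the SCSC idea: a recurrent auto-associative memory
# compresses a variable-length observation sequence into a constant-size
# hidden state, which a standard RL method then treats as its state input.
# Dimensions below are illustrative (the paper reports observations of up
# to 640 pixels and time lags of up to 12 steps).

rng = np.random.default_rng(0)

OBS_DIM = 8    # observation size (illustrative; up to 640 in the paper)
HID_DIM = 4    # constant-size compressed code

# Simple RNN encoder: h_t = tanh(W_in @ o_t + W_rec @ h_{t-1})
W_in = rng.normal(0, 0.3, (HID_DIM, OBS_DIM))
W_rec = rng.normal(0, 0.3, (HID_DIM, HID_DIM))
# A decoder would reconstruct the inputs from the code; in the
# auto-associative setup both parts are trained by back-propagation.
W_dec = rng.normal(0, 0.3, (OBS_DIM, HID_DIM))

def compress(observations):
    """Fold a sequence of observations into one constant-size code."""
    h = np.zeros(HID_DIM)
    for o in observations:
        h = np.tanh(W_in @ o + W_rec @ h)
    return h

# The RL component sees only the compressed code, e.g. a linear
# Q-function over a small discrete action set (hypothetical):
N_ACTIONS = 3
W_q = rng.normal(0, 0.1, (N_ACTIONS, HID_DIM))

# A 12-step history, matching the longest time lag reported above.
history = [rng.normal(size=OBS_DIM) for _ in range(12)]
code = compress(history)            # constant size regardless of history length
q_values = W_q @ code
action = int(np.argmax(q_values))
print(code.shape, q_values.shape, action)
```

The key property shown here is that `compress` returns the same-sized code whether the history has 1 step or 12, which is what lets a memoryless RL method operate in a partially observable environment.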


Similar articles

Reinforcement Learning and Design of Nonparametric Sequential Decision Networks

In this paper we discuss the design of sequential detection networks for nonparametric sequential analysis. We present a general probabilistic model for sequential detection problems where the sample size as well as the statistics of the sample can be varied. A general sequential detection network handles three decisions. First, the network decides whether to continue sampling or stop and make ...


A novel genetic reinforcement learning for nonlinear fuzzy control problems

Unlike supervised learning, a reinforcement learning problem has only very simple ‘‘evaluative’’ or ‘‘critic’’ information available for learning, rather than ‘‘instructive’’ information. A novel genetic reinforcement learning, called reinforcement sequential-search-based genetic algorithm (R-SSGA), is proposed for solving the nonlinear fuzzy control problems in this paper. Unlike the traditio...


Efficient Approximate Policy Iteration Methods for Sequential Decision Making in Reinforcement Learning

(Computer Science, Machine Learning) Efficient Approximate Policy Iteration Methods for Sequential Decision Making in Reinforcement Learning


Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging

We consider d-dimensional linear stochastic approximation algorithms (LSAs) with a constant step-size and the so-called Polyak-Ruppert (PR) averaging of iterates. LSAs are widely applied in machine learning and reinforcement learning (RL), where the aim is to compute an appropriate θ∗ ∈ ℝᵈ (that is, an optimum or a fixed point) using noisy data and O(d) updates per iteration. In this paper, we ar...


Efficient Bayesian Nonparametric Methods for Model-Free Reinforcement Learning in Centralized and Decentralized Sequential Environments

Efficient Bayesian Nonparametric Methods for Model-Free Reinforcement Learning in Centralized and Decentralized Sequential Environments by Miao Liu Department of Electrical and Computer Engineering Duke University




Journal title:

Volume   Issue

Pages  -

Publication date: 2011