On using discretized Cohen-Grossberg node dynamics for model-free actor-critic neural learning in non-Markovian domains

Authors

Eiji Mizutani

Stuart E. Dreyfus

Abstract

We describe how multi-stage non-Markovian decision problems can be solved with actor-critic reinforcement learning by assuming that a discrete version of Cohen-Grossberg node dynamics describes the node-activation computations of a neural network (NN). Our NN (i.e., the agent) renders the process Markovian implicitly and automatically in a totally model-free fashion, without learning by how much the state space must be augmented for the Markov property to hold. This serves as an alternative to Elman- or Jordan-type recurrent neural networks, whose context units function as a history memory in order to develop sensitivity to non-Markovian dependencies. We demonstrate the concept on a small-scale non-Markovian deterministic path problem, in which our actor-critic NN finds an optimal sequence of actions (but learns neither the transition dynamics nor the associated rewards), although it needs many iterations owing to the nature of neural model-free learning. This is, in spirit, a neurodynamic programming approach.
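As a rough illustration of why discretized node dynamics can act as an implicit memory, the sketch below applies one Euler step of the additive special case of the Cohen-Grossberg equations, x_i(t+1) = (1 - h) x_i(t) + h [sum_j w_ij s(x_j(t)) + I_i]. This is a generic textbook discretization and not necessarily the form used by the authors; the function names, the step size `h`, the weights, and the input sequences are all hypothetical values chosen for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def step(x, inputs, weights, h=0.5):
    """One Euler step of additive Cohen-Grossberg-style node dynamics.

    x       : current node states (list of floats)
    inputs  : external input to each node at this stage
    weights : weights[i][j] is the connection from node j to node i
    h       : step size (assumed value; 0 < h <= 1)
    """
    out = [sigmoid(v) for v in x]
    new_x = []
    for i in range(len(x)):
        net = sum(weights[i][j] * out[j] for j in range(len(x))) + inputs[i]
        # The (1 - h) * x[i] term carries part of the old state forward,
        # so earlier stages still influence the current activation.
        new_x.append((1.0 - h) * x[i] + h * net)
    return new_x

def run(input_sequence, weights, h=0.5):
    """Apply step() over a sequence of stage inputs, starting from zero state."""
    x = [0.0] * len(weights)
    for inputs in input_sequence:
        x = step(x, inputs, weights, h)
    return x

# Hypothetical two-node network and two histories that end in the same input.
weights = [[0.0, 0.3], [-0.2, 0.0]]
history_a = [[1.0, 0.0], [0.0, 1.0]]
history_b = [[0.0, 0.0], [0.0, 1.0]]

state_a = run(history_a, weights)
state_b = run(history_b, weights)
print(state_a, state_b)
```

Both histories present the same final input, yet the resulting node states differ because each state retains a decayed trace of earlier stages. That retained trace is the kind of history sensitivity that Elman- or Jordan-type context units would otherwise have to supply.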

Related resources

Totally Model-Free Reinforcement Learning by Actor-Critic Elman Networks in Non-Markovian Domains

In this paper we describe how an actor-critic reinforcement learning agent in a non-Markovian domain finds an optimal sequence of actions in a totally model-free fashion; that is, the agent neither learns transitional probabilities and associated rewards nor by how much the state space should be augmented so that the Markov property holds. In particular, we employ an Elman-type recurrent neural ne...

Robust stability of fuzzy Markov type Cohen-Grossberg neural networks by delay decomposition approach

In this paper, we investigate the delay-dependent robust stability of fuzzy Cohen-Grossberg neural networks with Markovian jumping parameters and mixed time-varying delays by the delay decomposition method. A new Lyapunov-Krasovskii functional (LKF) is constructed by nonuniformly dividing the discrete delay interval into multiple subintervals, and choosing proper functionals with different weighting matr...

Learning to Play Donkey Kong Using Neural Networks and Reinforcement Learning

Neural networks and reinforcement learning have successfully been applied to various games, such as Ms. Pacman and Go. We combine multilayer perceptrons and a class of reinforcement learning algorithms known as actor-critic to learn to play the arcade classic Donkey Kong. Two neural networks are used in this study: the actor and the critic. The actor learns to select the best action given the g...

Online Learning of Optimal Control Solutions Using Integral Reinforcement Learning and Neural Networks

In this paper we introduce an online algorithm that uses integral reinforcement knowledge for learning the continuous-time optimal control solution for nonlinear systems with infinite horizon costs and partial knowledge of the system dynamics. This algorithm is a data based approach to the solution of the Hamilton-Jacobi-Bellman equation and it does not require explicit knowledge on the system’...

A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems

An online adaptive reinforcement learning-based solution is developed for the infinite-horizon optimal control problem for continuous-time uncertain nonlinear systems. A novel actor–critic–identifier (ACI) is proposed to approximate the Hamilton–Jacobi–Bellman equation using three neural network (NN) structures—actor and critic NNs approximate the optimal control and the optimal value function,...

Publication date: 2003
