Using Temporal Neighborhoods to Adapt Function Approximators in Reinforcement Learning

Authors

  • R. Matthew Kretchmar
  • Charles W. Anderson
Abstract

To avoid the curse of dimensionality, function approximators are used in reinforcement learning to learn value functions for individual states. In order to make better use of computational resources (basis functions), many researchers are investigating ways to adapt the basis functions during the learning process so that they better fit the value-function landscape. Here we introduce temporal neighborhoods as small groups of states that experience frequent intragroup transitions during on-line sampling. We then form basis functions along these temporal neighborhoods. Empirical evidence is provided which demonstrates the effectiveness of this scheme. We discuss a class of RL problems for which this method might be plausible.
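The abstract describes the method only at a high level; the following Python sketch shows one way the grouping step could look. It is an illustrative assumption, not the authors' implementation: the class name TemporalNeighborhoods, the min_count threshold, and the binary membership features are all hypothetical, and the grouping is a simple union of states linked by frequently observed transitions.

    import numpy as np
    from collections import defaultdict

    class TemporalNeighborhoods:
        """Illustrative sketch: group states that transition to one another
        frequently during on-line sampling, then treat each group as a
        basis function. Hypothetical names and parameters throughout."""

        def __init__(self, min_count=5):
            self.min_count = min_count                 # transitions needed to link two states
            self.transition_counts = defaultdict(int)  # (s, s') -> observed count

        def record_transition(self, s, s_next):
            """Call once for every transition observed while sampling on-line."""
            self.transition_counts[(s, s_next)] += 1

        def neighborhoods(self, states):
            """Merge states linked by frequent transitions into small groups
            (temporal neighborhoods) using a union-find structure."""
            parent = {s: s for s in states}

            def find(s):
                while parent[s] != s:
                    parent[s] = parent[parent[s]]   # path halving
                    s = parent[s]
                return s

            for (s, s_next), count in self.transition_counts.items():
                if count >= self.min_count and s in parent and s_next in parent:
                    parent[find(s)] = find(s_next)  # union the two groups

            groups = defaultdict(list)
            for s in states:
                groups[find(s)].append(s)
            return list(groups.values())

        def features(self, s, groups):
            """Binary basis functions: one feature per temporal neighborhood."""
            return np.array([1.0 if s in g else 0.0 for g in groups])

    # Example: a short trajectory on a six-state chain; states 0-2 and 3-5
    # end up in separate temporal neighborhoods.
    trajectory = [0, 1, 0, 1, 2, 1, 2, 3, 4, 3, 4, 5, 4, 5]
    tn = TemporalNeighborhoods(min_count=2)
    for s, s_next in zip(trajectory, trajectory[1:]):
        tn.record_transition(s, s_next)
    groups = tn.neighborhoods(states=range(6))
    phi = tn.features(1, groups)   # feature vector for state 1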


Similar resources

A NEAT Way for Evolving Echo State Networks

The Reinforcement Learning (RL) paradigm is an appropriate formulation for agent, goal-directed, sequential decision making. In order though for RL methods to perform well in difficult, complex, real-world tasks, the choice and the architecture of an appropriate function approximator is of crucial importance. This work presents a method of automatically discovering such function approximators, ...


Evolutionary Function Approximation for Reinforcement Learning

Temporal difference methods are theoretically grounded and empirically effective methods for addressing reinforcement learning problems. In most real-world reinforcement learning tasks, TD methods require a function approximator to represent the value function. However, using function approximators requires manually making crucial representational decisions. This paper investigates evolutionary...


Transfer Learning via Inter-Task Mappings for Temporal Difference Learning

Temporal difference (TD) learning (Sutton and Barto, 1998) has become a popular reinforcement learning technique in recent years. TD methods, relying on function approximators to generalize learning to novel situations, have had some experimental successes and have been shown to exhibit some desirable properties in theory, but the most basic algorithms have often been found slow in practice. Th...


High-accuracy value-function approximation with neural networks applied to the acrobot

Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this paper, we present experimental results obtained by using a feedforward neural network instead. The learning algorithm used was model-based continuous TD(λ). It generated an efficient controller, producing a high-accuracy ...
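As a point of reference, a minimal sketch of the generic linear, discrete-time TD(λ) update with accumulating eligibility traces is given below. The paper summarized here uses a feedforward neural network and a model-based continuous TD(λ) variant, so this is a simplification, and the parameter names and values are illustrative assumptions.

    import numpy as np

    def td_lambda_step(w, phi, reward, phi_next, traces,
                       alpha=0.01, gamma=0.99, lam=0.9):
        """One TD(lambda) update for a linear value estimate V(s) = w . phi(s)
        with accumulating eligibility traces. Generic textbook form, not the
        model-based continuous TD(lambda) used in the cited paper."""
        delta = reward + gamma * np.dot(w, phi_next) - np.dot(w, phi)
        traces = gamma * lam * traces + phi    # decay old credit, add current features
        w = w + alpha * delta * traces         # distribute the TD error along the traces
        return w, traces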


Empirical Comparison of Gradient Descent and Exponentiated Gradient Descent in

This report describes a series of results using the exponentiated gradient descent (EG) method recently proposed by Kivinen and Warmuth. Prior work is extended by comparing speed of learning on a nonstationary problem and on an extension to backpropagation networks. Most significantly, we present an extension of the EG method to temporal-difference and reinforcement learning. This extension is co...
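For orientation, the exponentiated gradient (EG) update of Kivinen and Warmuth replaces the additive gradient descent step with a multiplicative update followed by renormalization, so the weights remain positive and sum to one. A minimal sketch for a linear predictor under squared loss follows; the function names and learning rate are illustrative assumptions.

    import numpy as np

    def gd_step(w, x, y, lr=0.1):
        """Standard (additive) gradient descent step for a linear predictor
        under squared loss: w <- w - lr * (w.x - y) * x."""
        error = np.dot(w, x) - y
        return w - lr * error * x

    def eg_step(w, x, y, lr=0.1):
        """Exponentiated gradient (EG) step: multiplicative update followed by
        renormalization; w is assumed to be a positive vector summing to one."""
        error = np.dot(w, x) - y
        w_new = w * np.exp(-lr * error * x)
        return w_new / w_new.sum()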





Publication year: 1999