Search results for: temporal difference learning

Number of results: 1,222,164

2004
Alexandros M. Grigoriadis Georgios Paliouras

This paper deals with the problem of constructing an intelligent Focused Crawler, i.e. a system that is able to retrieve documents on a specific topic from the Web. The crawler must contain a component that assigns visiting priorities to links by estimating the probability that each link will lead to a relevant page in the future. Reinforcement Learning was chosen as a method that fits this task nicely...
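
As an illustration of the kind of component described here, the following is a minimal sketch (not the authors' system) of a crawl loop that orders its frontier by a learned value estimate; `extract_features`, `value_fn`, `fetch`, and `is_relevant` are hypothetical stand-ins for the components the abstract names:

```python
import heapq

def crawl(seed_links, extract_features, value_fn, fetch, is_relevant, budget=100):
    """Visit links in order of estimated long-term relevance."""
    frontier = [(-value_fn(extract_features(url)), url) for url in seed_links]
    heapq.heapify(frontier)
    relevant_pages = []
    while frontier and budget > 0:
        _, url = heapq.heappop(frontier)  # highest estimated value first
        page, out_links = fetch(url)
        budget -= 1
        if is_relevant(page):
            relevant_pages.append(page)
        for link in out_links:
            # The priority estimates the (discounted) probability of reaching
            # a relevant page by following this link, as learned by RL.
            heapq.heappush(frontier, (-value_fn(extract_features(link)), link))
    return relevant_pages
```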

Journal: :Network 2008
Michael Moutoussis Richard P Bentall Jonathan Williams Peter Dayan

Aversive processing plays a central role in human phobic fears and may also be important in some symptoms of psychosis. We developed a temporal-difference model of the conditioned avoidance response, an important experimental model for aversive learning which is also a central pharmacological model of psychosis. In the model, dopamine neurons reported outcomes that were better than the learner ...
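
The quantity such model dopamine units are typically taken to report is the temporal-difference prediction error; a minimal tabular sketch, with `alpha` and `gamma` purely illustrative:

```python
# delta > 0 signals an outcome better than predicted; delta < 0, worse.
def td_error(V, s, r, s_next, gamma=0.95):
    return r + gamma * V[s_next] - V[s]

def td_update(V, s, r, s_next, alpha=0.1, gamma=0.95):
    V[s] += alpha * td_error(V, s, r, s_next, gamma)
```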

2013
Florian Kunz

Temporal Difference (TD) learning is one of the most widely used approaches to policy evaluation and a central part of solving reinforcement learning tasks. Deriving optimal control requires evaluating policies, a task that in turn requires value function approximation; this is where TD methods find application. The use of eligibility traces for backpropagation of updates as well as the bootstrapping of th...
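
A compact sketch of what this abstract describes: TD(lambda) policy evaluation with linear function approximation and accumulating eligibility traces. The episode format and step sizes are assumptions:

```python
import numpy as np

def td_lambda(episodes, phi, n_features, alpha=0.05, gamma=0.99, lam=0.9):
    theta = np.zeros(n_features)
    for episode in episodes:                # episode: [(s, r, s_next, done), ...]
        z = np.zeros(n_features)            # eligibility trace
        for s, r, s_next, done in episode:
            v = theta @ phi(s)
            v_next = 0.0 if done else theta @ phi(s_next)
            delta = r + gamma * v_next - v  # TD error (bootstrapped target)
            z = gamma * lam * z + phi(s)    # decay and accumulate
            theta += alpha * delta * z      # credit recently visited states
    return theta
```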

2009
Zeb Kurth-Nelson A. David Redish

Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential d...
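
A hedged sketch of the distributed-discounting idea: learn one value estimate per discount factor and combine them, so the effective discount curve is a mixture of exponentials. The class name and the particular set of gammas are illustrative:

```python
import numpy as np

class DistributedGammaTD:
    def __init__(self, n_states, gammas=(0.5, 0.8, 0.9, 0.95, 0.99), alpha=0.1):
        self.gammas = gammas
        self.V = np.zeros((len(gammas), n_states))  # one value table per gamma
        self.alpha = alpha

    def update(self, s, r, s_next):
        for i, g in enumerate(self.gammas):
            delta = r + g * self.V[i, s_next] - self.V[i, s]
            self.V[i, s] += self.alpha * delta

    def value(self, s):
        return self.V[:, s].mean()  # combined estimate across discount factors
```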

Journal: :IEEE transactions on neural networks 1997
Chengan Guo Anthony Kuh

This paper proposes a novel neural-network method for sequential detection. We first examine the optimal parametric sequential probability ratio test (SPRT) and make a simple equivalent transformation of the SPRT that makes it suitable for neural-network architectures. We then discuss how neural networks can learn the SPRT decision functions from observation data and labels. Conventional superv...
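
For reference, the classical SPRT that the paper's networks learn to emulate accumulates log-likelihood ratios until one of Wald's thresholds is crossed; a minimal sketch, with the likelihood functions supplied by the caller:

```python
import math

def sprt(observations, log_lik_h1, log_lik_h0, alpha=0.01, beta=0.01):
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    llr, n = 0.0, 0
    for n, x in enumerate(observations, start=1):
        llr += log_lik_h1(x) - log_lik_h0(x)
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", n
```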

Journal: :Network 2009
Peter Dayan

A striking recent finding is that monkeys behave maladaptively in a class of tasks in which they know that reward is going to be systematically delayed. This may be explained by a malign Pavlovian influence arising from states with low predicted values. However, by very carefully analyzing behavioral data from such tasks, La Camera and Richmond (2008) observed the additional important character...

2000
Frédérick Garcia Florent Serre

TD(λ) is an algorithm that learns the value function associated with a policy in a Markov Decision Process (MDP). We propose in this paper an asymptotic approximation of online TD(λ) with accumulating eligibility trace, called ATD(λ). We then use the Ordinary Differential Equation (ODE) method to analyse ATD(λ) and to op...
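
The ODE method mentioned here studies TD(λ) through its mean dynamics dθ/dt = b − Aθ, whose equilibrium A⁻¹b is the TD fixed point. Below is a sketch (not the paper's ATD(λ) derivation) of estimating A and b from a sampled trajectory with an accumulating trace:

```python
import numpy as np

def td_lambda_mean_dynamics(transitions, phi, n_features, gamma=0.99, lam=0.9):
    A = np.zeros((n_features, n_features))
    b = np.zeros(n_features)
    z = np.zeros(n_features)                 # accumulating eligibility trace
    for s, r, s_next in transitions:
        z = gamma * lam * z + phi(s)
        A += np.outer(z, phi(s) - gamma * phi(s_next))
        b += z * r
    A /= len(transitions)
    b /= len(transitions)
    theta_star = np.linalg.solve(A, b)       # equilibrium of dtheta/dt = b - A theta
    return A, b, theta_star
```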

2016
Bo Liu Ji Liu Mohammad Ghavamzadeh Sridhar Mahadevan Marek Petrik

In this paper, we describe proximal gradient temporal difference learning, which provides a principled way for designing and analyzing true stochastic gradient temporal difference learning algorithms. We show how gradient TD (GTD) reinforcement learning methods can be formally derived, not with respect to their original objective functions as previously attempted, but rather with respect to pri...
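
For context, a sketch of the GTD2 update that this line of work rederives from a primal-dual saddle-point objective; the two-timescale step sizes are illustrative:

```python
import numpy as np

def gtd2(transitions, phi, n_features, alpha=0.01, beta=0.1, gamma=0.99):
    theta = np.zeros(n_features)   # primal: value-function weights
    w = np.zeros(n_features)       # auxiliary (dual) weights
    for s, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        delta = r + gamma * theta @ f_next - theta @ f   # TD error
        theta += alpha * (f - gamma * f_next) * (w @ f)  # corrected gradient step
        w += beta * (delta - w @ f) * f                  # track E[delta | s]
    return theta
```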

2015
H. Yu

We consider emphatic temporal-difference learning algorithms for policy evaluation in discounted Markov decision processes with finite spaces. Such algorithms were recently proposed by Sutton, Mahmood, and White (2015) as an improved solution to the problem of divergence of off-policy temporal-difference learning with linear function approximation. We present in this paper the first convergence...
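
A sketch of emphatic TD(0) for off-policy evaluation, assuming unit interest in every state: the followon trace F reweights updates by how much value estimation at each state matters under the target policy, which is what underlies the convergence analysis discussed here. `rho(s, a)` is the importance-sampling ratio pi(a|s)/mu(a|s):

```python
import numpy as np

def emphatic_td0(transitions, phi, rho, n_features, alpha=0.01, gamma=0.95):
    theta = np.zeros(n_features)
    F, rho_prev = 0.0, 1.0
    for s, a, r, s_next in transitions:   # behavior-policy trajectory
        F = gamma * rho_prev * F + 1.0    # followon trace, interest i(s) = 1
        delta = r + gamma * theta @ phi(s_next) - theta @ phi(s)
        rho_t = rho(s, a)
        theta += alpha * F * rho_t * delta * phi(s)
        rho_prev = rho_t
    return theta
```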

2006
Alborz Geramifard Michael H. Bowling Richard S. Sutton

Approximate policy evaluation with linear function approximation is a commonly arising problem in reinforcement learning, usually solved using temporal difference (TD) algorithms. In this paper we introduce a new variant of linear TD learning, called incremental least-squares TD learning, or iLSTD. This method is more data efficient than conventional TD algorithms such as TD(0) and is more comp...
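
A hedged sketch of the iLSTD idea as described: maintain the least-squares statistics A and b incrementally and, per step, update only the coordinate with the largest outstanding TD update, keeping per-step cost far below full LSTD. The step size and number of passes `m` are illustrative:

```python
import numpy as np

def ilstd(transitions, phi, n_features, alpha=0.1, gamma=0.99, m=1):
    theta = np.zeros(n_features)
    A = np.zeros((n_features, n_features))
    b = np.zeros(n_features)
    mu = np.zeros(n_features)              # invariant: mu = b - A @ theta
    for s, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        dA = np.outer(f, f - gamma * f_next)
        A += dA
        b += f * r
        mu += f * r - dA @ theta
        for _ in range(m):                 # greedy coordinate descent passes
            j = np.argmax(np.abs(mu))      # coordinate most in need of updating
            step = alpha * mu[j]
            theta[j] += step
            mu -= step * A[:, j]           # keep the invariant consistent
    return theta
```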
