Search results for: temporal difference learning

Number of results: 1222164

1999
Justin A. Boyan

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-98-152.) TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It has two major drawbacks: it ...
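As a quick illustration of the incremental update this snippet describes, here is a minimal tabular TD(λ) sketch with accumulating eligibility traces. This is the standard textbook form, not code from Boyan's thesis; the function name, episode format, and default hyperparameters are illustrative assumptions.

```python
import numpy as np

def td_lambda_evaluate(episodes, n_states, alpha=0.1, gamma=0.99, lam=0.9):
    """Tabular TD(lambda) policy evaluation with accumulating traces.

    Each episode is assumed to be a list of (state, reward, next_state, done)
    tuples collected under a fixed policy.
    """
    V = np.zeros(n_states)              # current value estimate per state
    for episode in episodes:
        e = np.zeros(n_states)          # eligibility traces, reset each episode
        for s, r, s_next, done in episode:
            # One-step TD error: bootstrapped target minus current estimate.
            delta = r + (0.0 if done else gamma * V[s_next]) - V[s]
            e[s] += 1.0                 # accumulate trace for the visited state
            V += alpha * delta * e      # credit all recently visited states
            e *= gamma * lam            # geometrically decay the traces
    return V

# Toy usage on a two-state chain: state 0 -> state 1 -> terminal, reward 1 at the end.
episodes = [[(0, 0.0, 1, False), (1, 1.0, 1, True)] for _ in range(200)]
print(td_lambda_evaluate(episodes, n_states=2))  # roughly [0.99, 1.0]
```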

2008
Simon M. Lucas

Evidently, any learning algorithm can only learn on the basis of the information given to it. This paper presents an initial attempt to place an upper bound on the information rates attainable with standard co-evolution and with temporal difference learning (TDL). The upper bound for TDL is shown to be much higher than for evolution. To test how well these bounds correlate with actual learning, a simple two-player game calle...

Journal: Network: Computation in Neural Systems, 2008

Journal: CoRR, 2017
Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys

Deep reinforcement learning (RL) has achieved several high-profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor. This may be acceptable for a simulator, but it severely limits the applicability of deep RL to many real-wo...

2010
Carlton Downey, Scott Sanner

Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of “bootstrapped” return estimates to make efficient use of sampled data. In particular, TD(λ) methods comprise a family of reinforcement learning algorithms that often yield fast convergence by averaging multiple estimators of the expected return. However, TD(λ) chooses a v...
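For reference, the averaging this snippet alludes to is the standard λ-return, which blends n-step return estimates with geometrically decaying weights (textbook definition, not notation taken from this paper):

```latex
G_t^{\lambda} = (1-\lambda)\sum_{n=1}^{\infty} \lambda^{n-1}\, G_t^{(n)},
\qquad
G_t^{(n)} = r_{t+1} + \gamma r_{t+2} + \cdots + \gamma^{n-1} r_{t+n} + \gamma^{n} V(s_{t+n}).
```

Setting λ = 0 recovers the one-step TD target, while λ = 1 recovers the Monte Carlo return; the fixed choice of λ in between is exactly what this paper revisits.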

Journal: Journal of Machine Learning Research, 2011
Tsuyoshi Ueno, Shin-ichi Maeda, Motoaki Kawanabe, Shin Ishii

Since the invention of temporal difference (TD) learning (Sutton, 1988), many new algorithms for model-free policy evaluation have been proposed. Although they have brought much progress in practical applications of reinforcement learning (RL), there still remain fundamental problems concerning statistical properties of the value function estimation. To solve these problems, we introduce a new ...

2009
Christopher M. Vigorito

Temporal-difference (TD) networks are a class of predictive state representations that use well-established TD methods to learn models of partially observable dynamical systems. Previous research with TD networks has dealt only with dynamical systems with finite sets of observations and actions. We present an algorithm for learning TD network representations of dynamical systems with continuous...

[Chart: number of search results per year]
