Search results for: temporal difference learning
Number of results: 1,222,164
One of the most prominent approaches to speeding up reinforcement learning is injecting human prior knowledge into the learning agent. This paper proposes a novel method for speeding up temporal difference learning using state-action similarities. These hand-coded similarities are tested in three well-studied domains of varying complexity, demonstrating the benefits of our approach.
In this theoretical contribution we provide mathematical proof that two of the most important classes of network learning, correlation-based differential Hebbian learning and reward-based temporal difference learning, are asymptotically equivalent when learning is timed by a local modulatory signal. This opens the opportunity to consistently reformulate most of the abstract reinforcement lear...
Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-Gammon illustrated how combining game-tree search with learning methods can achieve grandmaster-level play in backgammon. In this work, we develop a player for the game of hearts, a four-player game, based on stochastic linear regression and TD learning. Using a small ...
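The results above all build on the same one-step TD update. For reference, a minimal sketch of TD(0) with linear function approximation on a toy two-state chain (the function name, the toy data, and all parameter values are illustrative, not taken from any of the listed papers):

```python
import numpy as np

def td0_linear(features, rewards, next_features, alpha=0.1, gamma=0.9, epochs=200):
    """One-step TD(0) updates over a recorded batch of transitions.

    Each transition is (phi, r, phi_next); a terminal successor is an
    all-zeros feature vector. Returns the learned weight vector w, so that
    the value estimate of a state with features phi is w . phi.
    """
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        for phi, r, phi_next in zip(features, rewards, next_features):
            # TD error: bootstrapped target minus current estimate
            td_error = r + gamma * np.dot(w, phi_next) - np.dot(w, phi)
            w += alpha * td_error * phi
    return w

# Toy chain A -> B -> terminal with one-hot features; reward 1 on leaving B.
phis = np.array([[1.0, 0.0], [0.0, 1.0]])
rewards = np.array([0.0, 1.0])
next_phis = np.array([[0.0, 1.0], [0.0, 0.0]])
w = td0_linear(phis, rewards, next_phis)  # converges toward V(A)=0.9, V(B)=1.0
```

With gamma = 0.9 the fixed point is V(B) = 1 and V(A) = 0.9, which the sweep approaches geometrically.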
This is a companion note to our recent study of the weak convergence properties of constrained emphatic temporal-difference (ETD) learning algorithms from a theoretical perspective. It supplements that analysis with simulation results and illustrates the behavior of some of the ETD algorithms on three example problems.
Modeling and describing temporal structure in multimedia signals captured simultaneously by multiple sensors is important for realizing human-machine interaction and motion generation. This paper proposes a method for modeling temporal structure in multimedia signals based on temporal intervals of primitive signal patterns. Using the temporal difference between beginning points and the ...
Recently, researchers have investigated novel dual representations as a basis for dynamic programming and reinforcement learning algorithms. Although the convergence properties of classical dynamic programming algorithms have been established for dual representations, temporal difference learning algorithms have not yet been analyzed. In this paper, we study the convergence properties of tempor...
LSTD is numerically unstable for some ergodic Markov chains in which certain states are visited far more often than others, because the matrix that LSTD accumulates then has a large condition number. In this paper, we propose a variant of temporal difference learning with high data efficiency. A class of preconditioned temporal difference learning algorithms is also proposed to speed up the new met...
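To make the condition-number issue concrete, a hedged sketch of batch LSTD with an optional preconditioned solve. This is not the paper's proposed method; a simple Jacobi (diagonal) preconditioner stands in for the preconditioning idea, and all names and data are illustrative:

```python
import numpy as np

def lstd_weights(phis, next_phis, rewards, gamma=0.95, precondition=True):
    """Batch LSTD: accumulate A = sum phi (phi - gamma*phi_next)^T and
    b = sum r*phi, then solve A w = b. When visit counts across states are
    very skewed, A is ill-conditioned; Jacobi scaling solves the equivalent
    system (D^-1 A) w = D^-1 b with a smaller condition number.
    """
    d = phis.shape[1]
    A = np.zeros((d, d))
    b = np.zeros(d)
    for phi, phi_next, r in zip(phis, next_phis, rewards):
        A += np.outer(phi, phi - gamma * phi_next)
        b += r * phi
    if precondition:
        D_inv = np.diag(1.0 / np.diag(A))  # assumes nonzero diagonal
        return np.linalg.solve(D_inv @ A, D_inv @ b)
    return np.linalg.solve(A, b)

# Skewed visits: state 0 seen 100 times, state 1 once (one-hot features).
phis = np.vstack([np.tile([1.0, 0.0], (100, 1)), [[0.0, 1.0]]])
next_phis = np.vstack([np.tile([0.0, 1.0], (100, 1)), [[0.0, 0.0]]])
rewards = np.ones(101)
w_plain = lstd_weights(phis, next_phis, rewards, precondition=False)
w_pre = lstd_weights(phis, next_phis, rewards, precondition=True)
```

In exact arithmetic both solves give the same weights; the preconditioned system simply has better numerical behavior when the raw `A` is ill-conditioned.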