Search results for: temporal difference learning

Number of results: 1222164

Journal: Cybernetics and Systems 1999
Pawel Cichosz

2017
Ariel Rosenfeld Matthew E. Taylor Sarit Kraus

One of the most prominent approaches for speeding up reinforcement learning is injecting human prior knowledge into the learning agent. This paper proposes a novel method to speed up temporal difference learning by using state-action similarities. These hand-coded similarities are tested in three well-studied domains of varying complexity, demonstrating our approach’s benefits.

2008
Christoph Kolodziejski Bernd Porr Minija Tamosiunaite Florentin Wörgötter

In this theoretical contribution we provide a mathematical proof that two of the most important classes of network learning, correlation-based differential Hebbian learning and reward-based temporal difference learning, are asymptotically equivalent when timing the learning with a local modulatory signal. This opens the opportunity to consistently reformulate most of the abstract reinforcement lear...

2006
Nathan R. Sturtevant Adam M. White

Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-Gammon illustrated how the combination of game tree search and learning methods can achieve grandmaster-level play in backgammon. In this work, we develop a player for the game of Hearts, a 4-player game, based on stochastic linear regression and TD learning. Using a small ...

Journal: CoRR 2016
Huizhen Yu

This is a companion note to our recent study of the weak convergence properties of constrained emphatic temporal-difference learning (ETD) algorithms from a theoretic perspective. It supplements the latter analysis with simulation results and illustrates the behavior of some of the ETD algorithms using three example problems.

2006
Hiroaki Kawashima Kimitaka Tsutsumi Takashi Matsuyama

Modeling and describing temporal structure in multimedia signals, which are captured simultaneously by multiple sensors, is important for realizing human-machine interaction and motion generation. This paper proposes a method for modeling temporal structure in multimedia signals based on temporal intervals of primitive signal patterns. Using temporal difference between beginning points and the ...

2009
Min Yang Yuxi Li Dale Schuurmans

Recently, researchers have investigated novel dual representations as a basis for dynamic programming and reinforcement learning algorithms. Although the convergence properties of classical dynamic programming algorithms have been established for dual representations, temporal difference learning algorithms have not yet been analyzed. In this paper, we study the convergence properties of tempor...

2007
Yao HengShuai

LSTD is numerically unstable for some ergodic Markov chains in which some states are visited preferentially over the rest, because the matrix that LSTD accumulates has a large condition number. In this paper, we propose a variant of temporal difference learning with high data efficiency. A class of preconditioned temporal difference learning algorithms is also proposed to speed up the new met...
