temporal difference learning

Search results for: temporal difference learning

Number of results: 1222164

Mobile Agent Control in Intelligent Space using Reinforcement Learning

2006

László Jeni, Zoltán Istenes, Péter Korondi, Hideki Hashimoto

Finding the safest shortest path in an unknown environment is a fundamental task in mobile robotics. To emulate human adaptability in this field, we can use the Intelligent Space concept. The Intelligent Space is a distributed sensory system, which is the background infrastructure to observe human walking in a limited area. The observation of human beings is applied to create a walkable are...

Concurrent Reinforcement Learning from Customer Interactions

2013

David Silver, Leonard Newnham, David Barker, Suzanne Weller, Jason McFall

In this paper, we explore applications in which a company interacts concurrently with many customers. The company has an objective function, such as maximising revenue, customer satisfaction, or customer loyalty, which depends primarily on the sequence of interactions between company and customer. A key aspect of this setting is that interactions with different customers occur in parallel. As a...

A Meta-learning Method Based on Temporal Difference Error

2009

Kunikazu Kobayashi, Hiroyuki Mizoue, Takashi Kuremoto, Masanao Obayashi

In general, meta-parameters in a reinforcement learning system, such as the learning rate and the discount rate, are empirically determined and fixed during learning. Consequently, when the external environment changes, the system cannot adapt itself to the variation. Meanwhile, it is suggested that the biological brain might conduct reinforcement learning and adapt itself to the external environment ...

Gradient Temporal Difference Networks

2012

David Silver

Temporal-difference (TD) networks (Sutton and Tanner, 2004) are a predictive representation of state in which each node is an answer to a question about future observations or questions. Unfortunately, existing algorithms for learning TD networks are known to diverge, even in very simple problems. In this paper we present the first sound learning rule for TD networks. Our approach is to develop...

Chess Neighborhoods, Function Combination, and Reinforcement Learning

2000

Robert Levinson, Ryan Weber

Over the years, various research projects have attempted to develop a chess program that learns to play well given little prior knowledge beyond the rules of the game. Early on it was recognized that the key would be to adequately represent the relationships between the pieces and to evaluate the strengths or weaknesses of such relationships. As such, representations have developed, including a...

Lateral Inhibition Overcomes Limits of Temporal Difference Learning

2015

Jacob Rafati, David C. Noelle

There is growing support for Temporal Difference (TD) Learning as a formal account of the role of the midbrain dopamine system and the basal ganglia in learning from reinforcement. This account is challenged, however, by the fact that realistic implementations of TD Learning have been shown to fail on some fairly simple learning tasks — tasks well within the capabilities of humans and non-human...

Reinforcement learning of dynamic motor sequence: learning to stand up

1998

Jun Morimoto, Kenji Doya

In this paper, we propose a learning method for implementing human-like sequential movements in robots. As an example of dynamic sequential movement, we consider the “stand-up” task for a two-joint, three-link robot. In contrast to the case of steady walking or standing, the desired trajectory for such a transient behavior is very difficult to derive. The goal of the task is to find a pa...

Biological Models of Reinforcement Learning

Journal: KI, 2009

Julien Vitay, Jérémy Fix, Fred Henrik Hamker, Henning Schroll, Frederik Beuth

This review focuses on biological issues of reinforcement learning. Since W. Schultz's influential discovery of an analogy between the reward prediction error signal of the temporal difference algorithm and the firing pattern of some dopaminergic neurons in the midbrain during classical conditioning, biological models have emerged that use computational reinforcement learning concepts to e...

Learning Through Interaction

2010

Violeta Mirchevska, Boštjan Kaluža

Reinforcement learning is an approach for learning an optimal action policy through experience, i.e. by using the rewards observed in environment states. Reinforcement learning algorithms include adaptive dynamic programming, temporal difference learning and Q-learning[1]. Examples of successful applications of reinforcement learning are a controller for sustained inverted flight on an autonomous helicopter [...

Transfer Learning for Policy Search Methods

2006

Matthew E. Taylor, Shimon Whiteson, Peter Stone

An ambitious goal of transfer learning is to learn a task faster after training on a different, but related, task. In this paper we extend a previously successful temporal difference (Sutton & Barto, 1998) approach to transfer in reinforcement learning (Sutton & Barto, 1998) tasks to work with policy search. In particular, we show how to construct a mapping to translate a population of policies...
