Search results for: temporal difference learning

Number of results: 1,222,164

2009
Shivaram Kalyanakrishnan, Peter Stone

As machine learning is applied to increasingly complex tasks, it is likely that the diverse challenges encountered can only be addressed by combining the strengths of different learning algorithms. We examine this aspect of learning through a case study grounded in the robot soccer context. The task we consider is Keepaway, a popular benchmark for multiagent reinforcement learning from the simu...

2007
Marcus Hutter, Shane Legg

In the field of reinforcement learning, temporal difference (TD) learning is perhaps the most popular way to estimate the future discounted reward of states. We derive an equation for TD learning from statistical principles. Specifically, we start with the variational principle and then bootstrap to produce an updating rule for discounted state value estimates. The resulting equation is similar...
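The TD update the abstract refers to can be illustrated with a minimal sketch. This is a generic tabular TD(0) value update on a toy three-state chain, not the specific variational derivation from the paper; the states, rewards, and hyperparameters here are illustrative assumptions.

```python
# TD(0) on a tiny chain: s0 -> s1 -> s2 (terminal).
# Reward +1 on reaching the terminal state, 0 otherwise.
# GAMMA and ALPHA are illustrative choices, not taken from the paper.
GAMMA = 0.9   # discount factor
ALPHA = 0.1   # learning rate

V = {0: 0.0, 1: 0.0, 2: 0.0}   # state-value estimates; state 2 is terminal

def td0_update(s, r, s_next):
    """One TD(0) step: move V[s] toward the bootstrapped target r + GAMMA * V[s_next]."""
    target = r + GAMMA * V[s_next]
    V[s] += ALPHA * (target - V[s])

for _ in range(500):            # replay the episode s0 -> s1 -> s2 many times
    td0_update(0, 0.0, 1)       # transition s0 -> s1, reward 0
    td0_update(1, 1.0, 2)       # transition s1 -> s2, reward 1

# V[1] converges toward 1.0 and V[0] toward GAMMA * V[1] = 0.9
```

The key property, which the paper derives from statistical principles, is the bootstrap: the target for V[s] uses the current estimate of the next state's value rather than a full return.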

2014
Joost Broekens, Tim Baarslag

Understanding the affective, cognitive and behavioural processes involved in risk taking is essential for treatment and for setting environmental conditions to limit damage. Using Temporal Difference Reinforcement Learning (TDRL) we computationally investigated the effect of optimism in risk perception in a variety of goal-oriented tasks. Optimism in risk perception was studied by varying the c...

Journal: Eng. Appl. of AI, 2013
Simon Box, Ben Waterson

This paper shows how temporal difference learning can be used to build a signalized junction controller that will learn its own strategies though experience. Simulation tests detailed here show that the learned strategies can have high performance. This work builds upon previous work where a neural network based junction controller that can learn strategies from a human expert was developed (Bo...

2010
Matt Dilts, Hector Muñoz-Avila

In this paper we present an approach for reducing the memory footprint requirement of temporal difference methods in which the set of states is finite. We use case-based generalization to group the states visited during the reinforcement learning process. We follow a lazy learning approach; cases are grouped in the order in which they are visited. Any new state visited is assigned to an existin...

2005
Sattiraju V. Prabhakar

Learners, such as humans and intelligent agents, often require support to perform complex tasks in real-world environments. Inaccessible states and the complexity of tasks add to this requirement. Large state and action spaces associated with the environments also contribute to this need for support. In this paper, we present a design of a learning agent that learns from the environment in order...

Journal: CoRR, 2018
Vitchyr Pong, Shixiang Gu, Murtaza Dalal, Sergey Levine

Model-free reinforcement learning (RL) is a powerful, general tool for learning complex behaviors. However, its sample efficiency is often impractically large for solving challenging real-world problems, even with off-policy algorithms such as Q-learning. A limiting factor in classic model-free RL is that the learning signal consists only of scalar rewards, ignoring much of the rich information...
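The Q-learning baseline this abstract mentions can be sketched in tabular form. This is a generic off-policy Q-learning loop on a toy four-state corridor, offered only as a reference point for the scalar-reward learning signal the abstract discusses; the environment and hyperparameters are illustrative assumptions, not from the paper.

```python
import random

# Tabular Q-learning on a 4-state corridor: 0-1-2-3, goal at state 3.
# Actions: 0 = left, 1 = right. Reward +1 on reaching the goal, 0 otherwise.
GAMMA, ALPHA, EPS = 0.9, 0.5, 0.1
N_STATES, GOAL = 4, 3
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(s, a):
    """Deterministic corridor dynamics: move left/right, clipped to [0, GOAL]."""
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

random.seed(0)
for _ in range(200):                        # episodes
    s = 0
    for _ in range(200):                    # cap steps per episode
        # Epsilon-greedy action selection, breaking ties randomly.
        if random.random() < EPS or Q[s][0] == Q[s][1]:
            a = random.randrange(2)
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1
        s2, r, done = step(s, a)
        # Off-policy update: bootstrap from the best next-state action value.
        best_next = 0.0 if done else max(Q[s2])
        Q[s][a] += ALPHA * (r + GAMMA * best_next - Q[s][a])
        s = s2
        if done:
            break

# After learning, the greedy policy moves right in every non-goal state.
```

The update uses only the scalar reward plus a bootstrapped max over next-state values, which is exactly the limitation the abstract points to: everything else the transition reveals about the environment is discarded.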

2002
Nathaniel Scott Winstead

Historically, the accepted approach to control problems in physically complicated domains has been machine learning, because knowledge engineering in these domains can be extremely complicated. When the already physically complicated domain is also continuous and dynamical (possibly with composite and/or sequential goals), the learning task becomes even more difficult due t...

2008
Houcine Romdhane, Luc Lamontagne

In the paper, we investigate the use of reinforcement learning in CBR for estimating and managing a legacy case base for playing the game of Tetris. Each case corresponds to a local pattern describing the relative height of a subset of columns where pieces could be placed. We evaluate these patterns through reinforcement learning to determine if significant performance improvement can be observ...

2007
Abdelkarim Souissi, Hacene Rezine

In this article, we are interested in training the reactive navigation behaviours of a mobile robot in an unknown environment. The method we suggest ensures navigation in unknown environments containing obstacles of different shapes, and consists of bringing the robot to a goal position while avoiding obstacles and releasing it from tight corners and deadlock-shaped obstacles. In this fram...

Chart: number of search results per year
