Search results for: keywords reinforcement learning

Number of results: 2453256

1996
S Sathiya Keerthi B Ravindran

See the abstract for Chapter C3. Delayed reinforcement learning (RL) concerns the solution of stochastic optimal control problems. In this section we formulate and discuss the basics of such problems. Solution methods for delayed RL will be presented in Sections C3.4 and C3.5. In these three sections we will mainly consider problems in which the state and control spaces are finite se...

1994
Samira Sehad Claude F. Touzet

In this paper, we address an under-represented class of learning algorithms in the study of connectionism: reinforcement learning. We first introduce these classic methods in a new formalism which highlights the particularities of implementations such as Q-Learning, Q-Learning with Hamming distance, Q-Learning with statistical clustering and Dyna-Q. We then present in this formalism a neural imp...

1992
David J. Finton Yu Hen Hu

In reinforcement learning for multi-step problems, the sparse nature of the feedback aggravates the difficulty of learning to perform. This paper explores the use of a reinforcement learning architecture, leading to a discussion of reinforcement learning in terms of feature abstraction, credit-assignment, and temporal-difference learning. Issues discussed include: the conditioning of the reinfo...

2010
Violeta Mirchevska Boštjan Kaluža

Reinforcement learning is an approach for learning an optimal action policy through experience, i.e. by using the rewards observed in environment states. Reinforcement learning algorithms include adaptive dynamic programming, temporal difference learning and Q-learning [1]. Examples of successful applications of reinforcement learning include a controller for sustained inverted flight on an autonomous helicopter [...

Thesis: Ministry of Science, Research and Technology - University of Sistan and Baluchestan - Faculty of Literature and Humanities 1392

Abstract: The first purpose of this study was to investigate the effect of consciousness-raising (C-R) activities on the learning of grammatical structures (the simple present tense in this case) by Iranian guidance school EFL learners. The second was to investigate the effect of gender on learning the simple present tense through C-R activities and tasks. Finally, this study aimed to investigate the ...

1994
Richard Sutton

The methods of temporal differences (Samuel, 1959; Sutton, 1984, 1988) allow an agent to learn accurate predictions of stationary stochastic future outcomes. The learning is effectively stochastic approximation based on samples extracted from the process generating the agent's future. Sutton (1988) proved that for a special case of temporal differences, the expected values of the predictions co...

2017
Luntong Li Dazi Li Tianheng Song

Least-squares temporal difference learning (LSTD) has been used mainly for improving the data efficiency of the critic in actor-critic (AC). However, convergence analysis of the resulting algorithms is difficult when the policy is changing. In this paper, a new AC method is proposed based on LSTD under the discount criterion. The method comprises two components as its contribution: (1) LSTD works in an ...

2017
Mohammadhosein Hasanbeig Lacra Pavel

In this paper, we study a distributed control algorithm for the problem of unknown area coverage by a network of robots. The coverage objective is to locate a set of targets in the area and to minimize the robots’ energy consumption. The robots have no prior knowledge of the locations or the number of targets in the area. One efficient approach that can be used to relax the ro...

Babak Hosseinzadeh, Hamid Fallah Jamal Sadeghi, Zobeydeh Jaanbazi

The goal of this research was to examine the effect of electronic learning media on the reinforcement of youth social behavior from the point of view of computer course professors and students of Islamic Azad University of Sari. The statistical population included all computer students and professors of I.A.U. of Sari. The statistical sample was identified by using the sample content identification ...

1996
Andrea Bonarini

In this paper, we discuss situations that arise with reinforcement learning algorithms when the reinforcement is delayed. The decision to consider delayed reinforcement is typical in many applications, and we discuss some motivations for it. Then, we summarize Q-Learning, a popular algorithm for dealing with delayed reinforcement, and its recent extensions for learning fuzzy logic structures (F...
