Search results for: keywords reinforcement learning
Number of results: 2453256
See the abstract for Chapter C3. Delayed reinforcement learning (RL) concerns the solution of stochastic optimal control problems. In this section we formulate and discuss the basics of such problems. Solution methods for delayed RL will be presented in Sections C3.4 and C3.5. In these three sections we will mainly consider problems in which the state and control spaces are finite se...
In this paper, we address an under-represented class of learning algorithms in the study of connectionism: reinforcement learning. We first introduce these classic methods in a new formalism which highlights the particularities of implementations such as Q-Learning, Q-Learning with Hamming distance, Q-Learning with statistical clustering and Dyna-Q. We then present in this formalism a neural imp...
In reinforcement learning for multi-step problems, the sparse nature of the feedback aggravates the difficulty of learning to perform. This paper explores the use of a reinforcement learning architecture, leading to a discussion of reinforcement learning in terms of feature abstraction, credit-assignment, and temporal-difference learning. Issues discussed include: the conditioning of the reinfo...
Reinforcement learning is an approach for learning an optimal action policy from experience, i.e. from rewards observed in environment states. Reinforcement learning algorithms include adaptive dynamic programming, temporal difference learning and Q-learning [1]. Examples of successful applications of reinforcement learning include a controller for sustained inverted flight on an autonomous helicopter [...
Abstract: The first purpose of this study was to investigate the effect of consciousness-raising (C-R) activities on learning grammatical structures (the simple present tense in this case) by Iranian guidance school EFL learners. The second was to investigate the effect of gender on learning the simple present tense through C-R activities and tasks. Finally, this study aimed to investigate the ...
The methods of temporal differences (Samuel, 1959; Sutton, 1984, 1988) allow an agent to learn accurate predictions of stationary stochastic future outcomes. The learning is effectively stochastic approximation based on samples extracted from the process generating the agent's future. Sutton (1988) proved that for a special case of temporal differences, the expected values of the predictions co...
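As a rough illustration of the temporal-difference idea this abstract refers to, here is a minimal TD(0) prediction sketch. It is not taken from the cited paper: the function name `td0_update`, the two-step chain, and the step size and discount values are all assumptions chosen for the example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0):
    """Apply one TD(0) update to values[state] in place; return the TD error.

    alpha (step size) and gamma (discount) are illustrative defaults.
    """
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical deterministic chain: A -> B -> end, with reward 1 on the last step.
values = {"A": 0.0, "B": 0.0, "end": 0.0}
episode = [("A", 0.0, "B"), ("B", 1.0, "end")]

# Replay the same episode many times; each prediction moves toward
# the reward plus the (discounted) prediction at the next state.
for _ in range(200):
    for state, reward, next_state in episode:
        td0_update(values, state, reward, next_state)
```

In this toy chain both predictions approach 1, the total future reward from either state. The value for A is learned only through the prediction at B, which is the bootstrapping step that distinguishes temporal-difference methods from waiting for the final outcome.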
Least-squares temporal difference learning (LSTD) has been used mainly for improving the data efficiency of the critic in actor-critic (AC). However, convergence analysis of the resulting algorithms is difficult when the policy is changing. In this paper, a new AC method is proposed based on LSTD under the discount criterion. The method comprises two components as its contribution: (1) LSTD works in an ...
Distributed Coverage Control by Robot Networks in Unknown Environments Using a Modified EM Algorithm
In this paper, we study a distributed control algorithm for the problem of unknown area coverage by a network of robots. The coverage objective is to locate a set of targets in the area and to minimize the robots’ energy consumption. The robots have no prior knowledge of the location or the number of the targets in the area. One efficient approach that can be used to relax the ro...
The goal of this research was to examine the effect of electronic learning media on the reinforcement of youth social behavior from the point of view of computer course professors and students of Islamic Azad University of Sari. The statistical population comprised all computer students and professors of I.A.U of Sari. The statistical sample was identified using the sample content identification ...
In this paper, we discuss situations that arise with reinforcement learning algorithms when the reinforcement is delayed. Delayed reinforcement is typical of many applications, and we discuss some motivations for considering it. Then, we summarize Q-Learning, a popular algorithm for dealing with delayed reinforcement, and recent extensions that use it to learn fuzzy logic structures (F...
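To make the link between Q-Learning and delayed reinforcement concrete, the following is a minimal tabular sketch of the standard one-step Q-learning update. It does not reproduce the fuzzy extensions mentioned in the abstract; the function name `q_learning_update`, the state and action labels, and the values of `alpha` and `gamma` are assumptions made for illustration.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, next_actions,
                      alpha=0.5, gamma=0.9):
    """One-step tabular Q-learning update; returns the new Q(state, action).

    An empty next_actions list marks next_state as terminal (bootstrap value 0).
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-step task: s0 -> s1 -> goal, rewarded only on reaching the goal.
q = defaultdict(float)
q_learning_update(q, "s1", "right", 1.0, "goal", [])
q_learning_update(q, "s0", "right", 0.0, "s1", ["left", "right"])
```

The move out of `s0` earns no immediate reward, yet its estimate becomes positive because the update bootstraps from the best estimate at `s1`. This is how the delayed reward is credited to earlier actions.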