Search results for: q learning

Number of results: 717428

2005
Lucian Buşoniu Bart De Schutter Robert Babuška

In realistic multiagent systems, learning on the basis of complete state information is not feasible. We introduce adaptive state focus Q-learning, a class of methods derived from Q-learning that start learning with only the state information that is strictly necessary for a single agent to perform the task, and that monitor the convergence of learning. If lack of convergence is detected, the l...
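The abstract only sketches the idea, so here is a rough, hypothetical illustration of a Q-learner that starts from a reduced (single-agent) state and widens its state focus when convergence stalls; the class name, the TD-error convergence test, and the `focus`/`check_convergence` interface are assumptions made for illustration, not the method from the paper.

```python
import random
from collections import defaultdict

class AdaptiveStateFocusQ:
    """Illustrative sketch: Q-learning that starts from a reduced state
    and expands the state representation if convergence stalls."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.level = 0                # 0 = own state only; higher levels include more agents
        self.q = defaultdict(float)   # (focused_state, action) -> value
        self.max_delta = 0.0          # largest recent TD update, used as a convergence signal

    def focus(self, full_state):
        # full_state is assumed to be a tuple (own_obs, other_obs_1, other_obs_2, ...)
        return full_state[: self.level + 1]

    def act(self, full_state):
        s = self.focus(full_state)
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(s, a)])

    def update(self, full_state, action, reward, next_full_state):
        s, s2 = self.focus(full_state), self.focus(next_full_state)
        target = reward + self.gamma * max(self.q[(s2, a)] for a in self.actions)
        delta = target - self.q[(s, action)]
        self.q[(s, action)] += self.alpha * delta
        self.max_delta = max(self.max_delta, abs(delta))

    def check_convergence(self, threshold=0.01):
        # If TD errors are still large after a monitoring period, widen the state focus.
        if self.max_delta > threshold and self.level < 3:
            self.level += 1
            self.q.clear()            # relearn with the richer state (one possible policy)
        self.max_delta = 0.0
```

A real implementation would use a more careful convergence monitor than the maximum TD error, but the control flow is the same: learn on the narrow state, monitor, and expand only when needed.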

2002
James L. Carroll

We survey various transfer methods in Q-learning, a type of reinforcement learning, and present a variation on fixed sub-transfer which we call dynamic sub-transfer. We describe the pros and cons of dynamic sub-transfer as compared with the other transfer methods, and we describe qualitatively the situations where this method would be preferred over the fixed version of sub-transfer.

2009
Seydina M. Ndiaye

Decision problems arising in finite-horizon stochastic optimization without a model can be handled by adaptive methods. Several reinforcement learning algorithms have been proposed, such as Q-Learning or R-Learning, but they are defined for infinite-horizon problems. Here we propose a finite-horizon formulation with a compa...
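The entry contrasts Q-Learning and R-Learning, which target infinite-horizon problems, with a finite-horizon formulation. A common way to express the finite-horizon case is to index the Q-table by the elapsed time step, as in the sketch below; the `env.reset()`/`env.step()` interface is hypothetical, and this is not claimed to be the paper's exact model.

```python
import random
from collections import defaultdict

def finite_horizon_q_learning(env, horizon, actions, episodes=1000,
                              alpha=0.1, epsilon=0.1):
    """Tabular Q-learning with a time-indexed value function Q[t][(s, a)].
    Assumes env.reset() returns a state and env.step(s, a) returns
    (next_state, reward); both are hypothetical interfaces."""
    Q = [defaultdict(float) for _ in range(horizon + 1)]  # Q[horizon] stays zero

    for _ in range(episodes):
        s = env.reset()
        for t in range(horizon):
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a_: Q[t][(s, a_)])
            s2, r = env.step(s, a)
            # No discounting: the horizon itself bounds the return.
            target = r + max(Q[t + 1][(s2, a_)] for a_ in actions)
            Q[t][(s, a)] += alpha * (target - Q[t][(s, a)])
            s = s2
    return Q
```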

2003
R. C. Arkin Y. Endo B. Lee E. Martinson

This article describes three different methods for introducing machine learning into a hybrid deliberative/reactive architecture for multirobot systems: learning momentum, Q-learning, and CBR wizards. A range of simulation experiments and results are reported using the Georgia Tech MissionLab mission specification system.

2000
Sachiyo Arai Katia P. Sycara Terry R. Payne

In this paper, we discuss Profit-sharing, an experience-based reinforcement learning approach (which is similar to a Monte-Carlo based reinforcement learning method) that can be used to learn robust and effective actions within uncertain, dynamic, multi-agent systems. We introduce the cut-loop routine that discards looping behavior, and demonstrate its effectiveness empirically within a simplified ...
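As a hedged sketch of the two ingredients named in the abstract, the code below implements a generic Profit-sharing credit assignment (a geometrically decaying reinforcement function applied backwards along the episode trace) plus a cut-loop step that removes looping segments before credit is given. The decay value and the exact loop-cutting rule are illustrative choices, not the paper's.

```python
def cut_loops(trace):
    """Remove looping behaviour: whenever a state repeats, drop the segment
    between its two visits so no credit is given to the detour."""
    cleaned, seen = [], {}
    for state, action in trace:
        if state in seen:
            cleaned = cleaned[: seen[state]]                       # cut the loop
            seen = {s: i for i, (s, _) in enumerate(cleaned)}
        seen[state] = len(cleaned)
        cleaned.append((state, action))
    return cleaned

def profit_sharing_update(weights, trace, reward, decay=0.5):
    """Distribute the episode reward backwards along the (loop-free) trace
    with a geometrically decaying credit function."""
    credit = reward
    for state, action in reversed(cut_loops(trace)):
        weights[(state, action)] = weights.get((state, action), 0.0) + credit
        credit *= decay
    return weights
```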

1996
Sridhar Mahadevan

Research in reinforcement learning (RL) has thus far concentrated on two optimality criteria: the discounted framework, which has been very well-studied, and the average-reward framework, in which interest is rapidly increasing. In this paper, we present a framework called sensitive discount optimality which offers an elegant way of linking these two paradigms. Although sensitive discount optima...
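For context, the standard bridge between the two criteria mentioned here is the Laurent-style expansion of the discounted value as the discount factor approaches one; a sensitive-discount (Blackwell-type) analysis compares policies on the successive terms of this expansion. The form below is the textbook unichain statement, not necessarily the paper's notation:

```latex
% Discounted value of a stationary policy \pi as \gamma \to 1 (unichain case):
% the leading term carries the average reward (gain) \rho^\pi, the next term
% the bias h^\pi; sensitive discount optimality compares policies term by term.
V^{\pi}_{\gamma}(s) \;=\; \frac{\rho^{\pi}}{1-\gamma} \;+\; h^{\pi}(s) \;+\; \varepsilon^{\pi}(s,\gamma),
\qquad \varepsilon^{\pi}(s,\gamma) \to 0 \ \text{as } \gamma \to 1 .
```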

Journal: علوم کاربردی و محاسباتی در مکانیک
هادی کلانی علیرضا اکبرزاده توتونچی

This article presents an implementation of a reinforcement learning (RL) method for snake-like robot navigation. The paper starts with developing the kinematics and dynamics model of a snake robot in serpentine locomotion, followed by simulation, and finishes with actual experimentation. First, Gibbs-Appell's method is used to obtain the robot dynamics. The robot is also modeled in simme...

1999
Chris Gaskett David Wettergreen Alexander Zelinsky

Q-learning can be used to learn a control policy that maximises a scalar reward through interaction with the environment. Q-learning is commonly applied to problems with discrete states and actions. We describe a method suitable for control tasks which require continuous actions, in response to continuous states. The system consists of a neural network coupled with a novel interpolator. Simulati...
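The abstract pairs a neural Q-function with a novel interpolator to obtain continuous actions; the sketch below shows only the generic pattern (a parametric Q over continuous state-action pairs, with the greedy action approximated by scoring sampled candidate actions). The linear feature model and the random-candidate argmax are stand-ins, not the paper's network or interpolator.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(state, action):
    """Simple polynomial features of a continuous (state, action) pair;
    purely illustrative."""
    s, a = np.atleast_1d(state), np.atleast_1d(action)
    return np.concatenate([s, a, s * s, a * a, np.outer(s, a).ravel(), [1.0]])

class ContinuousQ:
    def __init__(self, state_dim, action_dim, alpha=0.01, gamma=0.95):
        n = 2 * state_dim + 2 * action_dim + state_dim * action_dim + 1
        self.w = np.zeros(n)
        self.alpha, self.gamma = alpha, gamma
        self.action_dim = action_dim

    def q(self, state, action):
        return float(self.w @ features(state, action))

    def greedy_action(self, state, n_candidates=64):
        # Approximate the argmax over continuous actions by scoring random candidates.
        candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, self.action_dim))
        return max(candidates, key=lambda a: self.q(state, a))

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * self.q(next_state, self.greedy_action(next_state))
        td_error = target - self.q(state, action)
        self.w += self.alpha * td_error * features(state, action)
```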

1998
Richard Dearden Nir Friedman Stuart J. Russell

A central problem in learning in complex environments is balancing exploration of untested actions against exploitation of actions that are known to be good. The benefit of exploration can be estimated using the classical notion of Value of Information—the expected improvement in future decision quality that might arise from the information acquired by exploration. Estimating this quantity requ...
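The exploration measure described here is the myopic value of perfect information used in Bayesian Q-learning. A minimal Monte-Carlo version, assuming independent normal beliefs over each action's Q-value, can be written as follows; the normal-belief assumption and the sampling estimator are simplifications, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(1)

def vpi_scores(means, stds, n_samples=10_000):
    """Monte-Carlo estimate of the myopic value of perfect information for each
    action, given independent normal beliefs over its Q-value:
      VPI(a) = E[max(q_a - best_other_mean, 0)]   if a is not the current best,
             = E[max(second_best_mean - q_a, 0)]  if a is the current best."""
    means, stds = np.asarray(means, float), np.asarray(stds, float)
    best = int(np.argmax(means))
    second = np.partition(means, -2)[-2]
    scores = np.empty_like(means)
    for a, (m, s) in enumerate(zip(means, stds)):
        q = rng.normal(m, s, n_samples)
        if a == best:
            scores[a] = np.maximum(second - q, 0.0).mean()
        else:
            scores[a] = np.maximum(q - means[best], 0.0).mean()
    return scores

# Example: act by expected Q-value plus its estimated value of information.
means, stds = [1.0, 0.8, 0.2], [0.05, 0.4, 0.1]
action = int(np.argmax(np.asarray(means) + vpi_scores(means, stds)))
```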

2016
Ryosuke Shibusawa Tomoaki Otsuka Toshiharu Sugawara

This paper proposes a behavioral strategy called expectation of cooperation strategy with which cooperation in the prisoner’s dilemma game emerges in agent networks by incorporating Q-learning. The proposed strategy is simple and easy to implement but nevertheless can evolve and maintain cooperation in all agent networks under certain conditions. We conducted a number of experiments to clarify ...
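For a concrete baseline of Q-learning in the iterated prisoner's dilemma, the sketch below runs two independent Q-learners whose state is the previous joint action; it does not implement the paper's expectation-of-cooperation strategy, and the payoff values and hyperparameters are arbitrary.

```python
import random
from collections import defaultdict

# Standard prisoner's dilemma payoffs for (row move, column move).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}
ACTIONS = ['C', 'D']

class PDQLearner:
    """Minimal Q-learner whose state is the previous joint action."""
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        self.q[(state, action)] += self.alpha * (
            reward + self.gamma * best_next - self.q[(state, action)])

a1, a2 = PDQLearner(), PDQLearner()
state = ('C', 'C')                       # arbitrary initial joint action
for _ in range(10_000):
    m1, m2 = a1.act(state), a2.act(state)
    r1, r2 = PAYOFF[(m1, m2)]
    a1.update(state, m1, r1, (m1, m2))
    a2.update(state, m2, r2, (m1, m2))   # both agents observe the joint outcome
    state = (m1, m2)
```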

Chart: number of search results per year