Search results for: q learning

Number of results: 717428

2008
Patrick Ulam Joshua Jones Ashok K. Goel

Human experience with interactive games will be enhanced if the game-playing software agents learn from their failures and do not make the same mistakes over and over again. Reinforcement learning, e.g., Q-Learning, provides one method for learning from failures. Model-based meta-reasoning that uses an agent’s self-model for blame assignment provides another. In this paper, we combine the two m...

Journal: Journal of Machine Learning Research 2003
Junling Hu Michael P. Wellman

We extend Q-learning to a noncooperative multiagent context, using the framework of general-sum stochastic games. A learning agent maintains Q-functions over joint actions, and performs updates based on assuming Nash equilibrium behavior over the current Q-values. This learning protocol provably converges given certain restrictions on the stage games (defined by Q-values) that arise during learn...

2013
Eric Sodomka Elizabeth Hilliard Michael L. Littman Amy Greenwald

Coco (“cooperative/competitive”) values are a solution concept for two-player normal-form games with transferable utility, when binding agreements and side payments between players are possible. In this paper, we show that coco values can also be defined for stochastic games and can be learned using a simple variant of Q-learning that is provably convergent. We provide a set of examples showing ...

2011
Alejandro Agostini Enric Celaya

In this work we propose an approach for generalization in continuous domain Reinforcement Learning that, instead of using a single function approximator, tries many different function approximators in parallel, each one defined in a different region of the domain. Associated with each approximator is a relevance function that locally quantifies the quality of its approximation, so that, at each...

Journal: JCS 2014
Ahmed Soua Hossam Afifi

Efficient propagation of information over a vehicular wireless network has long been a focus of the research community. However, few contributions have been made in the field of vehicular data collection, and especially in applying learning techniques to such a rapidly changing networking scheme. These smart learning approaches excel in making the collecting operation more reactiv...

2016
Yen-Chen Wu Tzu-Hsiang Lin Yang-De Chen Hung-yi Lee Lin-Shan Lee

User-machine interaction is important for spoken content retrieval. For text content retrieval, the user can easily scan through and select from a list of retrieved items. This is impossible for spoken content retrieval, because the retrieved items are difficult to show on screen. Besides, due to the high degree of uncertainty in speech recognition, the retrieval results can be very noisy. One wa...

2002
Poj Tangamchit John M. Dolan Pradeep K. Khosla

Learning can be an effective way for robot systems to deal with dynamic environments and changing task conditions. However, popular single-robot learning algorithms based on discounted rewards, such as Q learning, do not achieve cooperation (i.e., purposeful division of labor) when applied to task-level multirobot systems. A task-level system is defined as one performing a mission that is decompo...

2003
Shotaro Kamio Hideyuki Mitsuhasi Hitoshi Iba

We propose an integrated technique of genetic programming (GP) and reinforcement learning (RL) that allows a real robot to execute real-time learning. Our technique does not need a precise simulator because learning is done with a real robot. Moreover, our technique makes it possible to learn optimal actions in real robots. We show the result of an experiment with a real robot AIBO and represen...

2008
Christian Balkenius Stefan Winberg

A reinforcement architecture is introduced that consists of three complementary learning systems with different generalization abilities. The ACTOR learns state-action associations, the CRITIC learns a goal-gradient, and the PUNISH system learns what actions to avoid. The architecture is compared to the standard actor-critic and Q-learning models on a number of maze learning tasks. The novel a...

Journal: Expert Syst. Appl. 2011
Jing Li Zhaohan Sheng Kwan-Chew Ng

This paper studies a multi-goal Q-learning algorithm for cooperative teams. Each member of a cooperative team is simulated by an agent. In the virtual cooperative team, agents adapt their knowledge according to cooperative principles. The multi-goal Q-learning algorithm is applied to the multiple learning goals. In the virtual team, agents learn what knowledge to adopt and how much to learn (choo...
