q learning

Search results for: q learning

Number of results: 717428

Neural Machine Learning Approaches: Q-Learning and Complexity Estimation Based Information Processing System

2017

Abdennasser Chebira Abdelhamid Mellouk Kurosh Madani Said Hoceini


Learning mixed behaviours with parallel Q-learning

2002

Guillaume J. Laurent Emmanuel Piat

This paper presents a reinforcement learning algorithm based on a parallel approach to Watkins's Q-learning. This algorithm is used to control a two-axis micro-manipulator system. The aim is to learn complex behaviours such as reaching target positions and avoiding obstacles at the same time. The simulations and the tests with the real manipulator show that this algorithm is able to learn simult...


Multiagent Reinforcement Learning with Adaptive State Focus

2005

Lucian Busoniu Bart De Schutter Robert Babuska

In realistic multiagent systems, learning on the basis of complete state information is not feasible. We introduce adaptive state focus Q-learning, a class of methods derived from Q-learning that start learning with only the state information that is strictly necessary for a single agent to perform the task, and that monitor the convergence of learning. If lack of convergence is detected, the le...


Reinforcement Learning for Average Reward Zero-Sum Games

2004

Shie Mannor

We consider Reinforcement Learning for average reward zero-sum stochastic games. We present and analyze two algorithms. The first is based on relative Q-learning and the second on Q-learning for stochastic shortest path games. Convergence is proved using the ODE (Ordinary Differential Equation) method. We further discuss the case where not all the actions are played by the opponent with comparab...


RETALIATE: Learning Winning Policies in First-Person Shooter Games

2007

Megan Smith Stephen Lee-Urban Hector Muñoz-Avila

In this paper we present RETALIATE, an online reinforcement learning algorithm for developing winning policies in team first-person shooter games. RETALIATE has three crucial characteristics: (1) individual BOT behavior is fixed although not known in advance, therefore individual BOTS work as “plugins”, (2) RETALIATE models the problem of learning team tactics through a simple state formulation,...


An Intelligent Battery Controller Using Bias-Corrected Q-learning

2012

Donghun Lee Warren B. Powell

The transition to renewables requires storage to help smooth short-term variations in energy from wind and solar sources, as well as to respond to spikes in electricity spot prices, which can easily exceed 20 times their average. Efficient operation of an energy storage device is a fundamental problem, yet classical algorithms such as Q-learning can diverge for millions of iterations, limiting p...


Weighted Double Q-learning

2017

Zongzhang Zhang Zhiyuan Pan Mykel J. Kochenderfer

Q-learning is a popular reinforcement learning algorithm, but it can perform poorly in stochastic environments due to overestimating action values. Overestimation is due to the use of a single estimator that uses the maximum action value as an approximation for the maximum expected action value. To avoid overestimation in Q-learning, the double Q-learning algorithm was recently proposed, which u...
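
The abstract is cut off before it describes the weighted variant, so that part is not reconstructed here. As a rough illustration of the idea it builds on, the following is a minimal sketch of a standard tabular double Q-learning step, assuming two value tables where one selects the greedy next action and the other evaluates it. The function name, the table layout keyed by (state, action), and the default `alpha` and `gamma` are illustrative assumptions, not taken from the paper.

```python
import random
from collections import defaultdict


def double_q_update(q_a, q_b, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.99, rng=random):
    """One tabular double Q-learning step (hypothetical helper, updates in place)."""
    # Pick at random which table is updated; the other one only evaluates.
    if rng.random() < 0.5:
        update, evaluate = q_a, q_b
    else:
        update, evaluate = q_b, q_a
    # Select the greedy next action with the table being updated...
    best = max(actions, key=lambda a: update[(next_state, a)])
    # ...but score that action with the other table, which decouples
    # action selection from action evaluation.
    target = reward + gamma * evaluate[(next_state, best)]
    update[(state, action)] += alpha * (target - update[(state, action)])


# Usage: two tables that default to 0.0 for unseen (state, action) pairs.
q_a = defaultdict(float)
q_b = defaultdict(float)
double_q_update(q_a, q_b, "s0", 0, 1.0, "s1", actions=[0, 1])
```

Because the evaluating table did not choose the action, a value that is high in one table only by chance is less likely to be carried into the target.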


Reinforcement Learning with Internal Reward for Multi-Agent Cooperation: A Theoretical Approach

2015

Fumito Uwano Naoki Tatebe Masaya Nakata Keiki Takadama Tim Kovacs

This paper focuses on multi-agent cooperation, which is generally difficult to achieve without sufficient information about other agents, and proposes a reinforcement learning method that introduces an internal reward for multi-agent cooperation without sufficient information. To guarantee such cooperation, this paper theoretically derives the condition of selecting appropriate act...


Addressing Function Approximation Error in Actor-Critic Methods

Journal: CoRR 2018

Scott Fujimoto Herke van Hoof Dave Meger

In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies. We show that this problem persists in an actor-critic setting and propose novel mechanisms to minimize its effects on both the actor and critic. Our algorithm takes the minimum value between a pair of critics to restrict...
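
The abstract's description of taking the minimum value between a pair of critics can be sketched as a bootstrapped target. This is a minimal illustration under assumptions: the function name, the scalar inputs, the `done` flag, and the default `gamma` are hypothetical, and the rest of the algorithm (truncated in the abstract) is not reproduced.

```python
def clipped_double_q_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """Bootstrapped target using the smaller of two critic estimates (sketch)."""
    # Taking the minimum of the two critics limits how far an
    # overestimate in either one can propagate into the target.
    min_q = min(q1_next, q2_next)
    # Do not bootstrap past a terminal transition.
    return reward + (0.0 if done else gamma * min_q)


# Usage: both critics score the same next state-action pair.
target = clipped_double_q_target(reward=1.0, q1_next=2.0, q2_next=3.0, gamma=0.5)
```

Here the target uses the estimate 2.0 rather than 3.0, so the more optimistic critic does not drive the update.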


Addressing Environment Non-Stationarity by Repeating Q-learning Updates

Journal: Journal of Machine Learning Research 2016

Sherief Abdallah Michael Kaisers

Q-learning (QL) is a popular reinforcement learning algorithm that is guaranteed to converge to optimal policies in Markov decision processes. However, QL exhibits an artifact: in expectation, the effective rate of updating the value of an action depends on the probability of choosing that action. In other words, there is a tight coupling between the learning dynamics and underlying execution p...
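
To make the coupling described above concrete, here is a minimal sketch of the standard tabular Q-learning update, not the repeated-update method the paper proposes (the abstract is truncated before describing it). The function name, the table keyed by (state, action), and the default `alpha` and `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """One tabular Q-learning step; returns the new value of (state, action)."""
    # Bootstrap from the best estimated value in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    # Only the action that was actually executed is updated, so an action
    # that is rarely chosen has its value adjusted rarely.
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Usage: a table that defaults to 0.0 for unseen (state, action) pairs.
q = defaultdict(float)
q[("s1", "a")] = 2.0
new_value = q_learning_update(q, "s0", "a", 1.0, "s1", ["a", "b"],
                              alpha=0.5, gamma=0.5)
```

Since each step touches only the executed action, an action selected with probability 0.1 is updated, on average, once every ten steps, which is one way to read the abstract's point that the effective update rate depends on the selection probability.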

