Q-learning

Design, analysis and comparison of robot learners

Journal: Robotics and Autonomous Systems 1998

Jeremy Wyatt John Hoar Gillian Hayes

This paper outlines some ideas as to how robot learning experiments might best be designed. There are three principal findings: (i) in order to evaluate robot learners we must employ multiple evaluation methods together; (ii) in order to measure in any absolute way the performance of a learning algorithm we must characterise the complexity of the underlying decision task formed by the interaction...

Reinforcement Learning approach for Real Time Strategy Games Battle city and S3

Journal: CoRR 2016

Harshit Sethy Amit Patel

In this paper we propose reinforcement learning algorithms with a generalized reward function. Our method uses the Q-learning and SARSA algorithms with the generalized reward function to train the reinforcement learning agent. We evaluate the performance of the proposed algorithms on two real-time strategy games, BattleCity and S3. There are two main advantages of having such an...
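As a rough illustration of the two algorithms named in this abstract, the sketch below contrasts the standard textbook update targets of Q-learning and SARSA. It does not reproduce the paper's generalized reward function, which the truncated abstract does not specify. The table `q`, the function names, and the numeric values are assumptions chosen only for the example.

```python
def q_learning_target(q, reward, next_state, gamma):
    """Off-policy target: bootstrap from the best action in the next state."""
    return reward + gamma * max(q[next_state])


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + gamma * q[next_state][next_action]


def td_update(q, state, action, target, alpha):
    """Move Q(state, action) a fraction alpha toward the target."""
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Hypothetical two-state, two-action table, used only for illustration.
q = [[0.0, 0.0], [1.0, 3.0]]

# Q-learning looks at the greedy next action (value 3.0).
ql = q_learning_target(q, reward=1.0, next_state=1, gamma=0.5)

# SARSA looks at the next action that was actually selected (here action 0).
sa = sarsa_target(q, reward=1.0, next_state=1, next_action=0, gamma=0.5)

new_value = td_update(q, state=0, action=0, target=ql, alpha=0.5)
```

The only difference between the two targets is which next-state value is bootstrapped from, so the same `td_update` helper can serve both.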

Tentative Exploration on Reinforcement Learning Algorithms for Stochastic Rewards

2009

Luis Peña Antonio LaTorre José María Peña Sánchez Sascha Ossowski

This paper addresses a way to generate mixed strategies using reinforcement learning algorithms in domains with stochastic rewards. A new algorithm based on the Q-learning model, called TERSQ, is introduced. Unlike other approaches for stochastic scenarios, TERSQ uses a global exploration rate for all the state/action pairs in the same run. This exploration rate is selected at the beginnin...

Reinforcement learning and neural reinforcement learning

1994

Samira Sehad Claude F. Touzet

In this paper, we address an under-represented class of learning algorithms in the study of connectionism: reinforcement learning. We first introduce these classic methods in a new formalism which highlights the particularities of implementations such as Q-Learning, Q-Learning with Hamming distance, Q-Learning with statistical clustering and Dyna-Q. We then present in this formalism a neural imp...

A Comparison of Exploration/Exploitation Techniques for a Q-Learning Agent in the Wumpus World

2008

A. Friesen

The Q-Learning algorithm, suggested by Watkins [1], has become one of the most popular reinforcement learning algorithms due to its relatively simple implementation and the complexity reduction gained by the use of a model-free method. However, Q-Learning does not specify how to trade off exploration of the world for exploitation of the developed policy. Multiple such tradeoffs are possible and ...
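To make the exploration/exploitation trade-off concrete, here is a minimal sketch of two widely used action-selection rules, epsilon-greedy and softmax (Boltzmann). The abstract is truncated, so these are not necessarily the techniques the paper compares. The function names and the sample `q_values` are assumptions made for illustration.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probs(q_values, temperature):
    """Boltzmann distribution over actions; higher temperature is more uniform."""
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((v - top) / temperature) for v in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_action(q_values, temperature, rng):
    """Sample an action in proportion to its Boltzmann probability."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


rng = random.Random(0)
q_values = [0.1, 0.5, 0.2]  # hypothetical action values for one state

greedy = epsilon_greedy(q_values, epsilon=0.0, rng=rng)  # pure exploitation
explore = epsilon_greedy(q_values, epsilon=1.0, rng=rng)  # pure exploration
probs = softmax_probs(q_values, temperature=1.0)
sampled = softmax_action(q_values, temperature=1.0, rng=rng)
```

Epsilon-greedy explores uniformly regardless of the value estimates, whereas softmax explores in proportion to them, which is one reason the two can behave differently in the same environment.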

Continuous Valued Q-learning for Vision-Guided Behavior Acquisition

1999

Yasutake Takahashi Masanori Takeda Minoru Asada

Q-learning, one of the most widely used reinforcement learning methods, normally needs well-defined quantized state and action spaces to converge. This makes it difficult to apply to real robot tasks, because the learned behavior performs poorly and state space construction becomes a further problem. This paper proposes a continuous valued Q-learning for real robot applications, which calculates co...

Parallel Q-learning for a block-pushing problem

2001

Guillaume J. Laurent Emmanuel Piat

This paper presents an application of reinforcement learning to a block-pushing problem. The manipulator system we used is able to push millimeter-sized objects on a glass slide under a CCD camera. The objective is to automate high-level pushing tasks. Our approach is based on a reinforcement learning algorithm (Q-Learning) because the models of the manipulator and of the dynamics of objects ar...

Time manipulation technique for speeding up reinforcement learning in simulations

Journal: CoRR 2008

Petar Kormushev Kohei Nomoto Fangyan Dong Kaoru Hirota

A technique for speeding up reinforcement learning algorithms by using time manipulation is proposed. It is applicable to failure-avoidance control problems running in a computer simulation. Turning the time of the simulation backwards on failure events is shown to speed up the learning by 260% and improve the state space exploration by 12% on the cart-pole balancing task, compared to the conve...

A Multiagent Variant of Dyna-Q

2000

Gerhard Weiß

This paper describes a multiagent variant of Dyna-Q called M-Dyna-Q. Dyna-Q is an integrated single-agent framework for planning, reacting, and learning. Like Dyna-Q, M-Dyna-Q employs two key ideas: learning results can serve as a valuable input for both planning and reacting, and results of planning and reacting can serve as a valuable input to learning. M-Dyna-Q extends Dyna-Q in that planning...
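The interplay of learning and planning described here can be sketched with the commonly taught single-agent tabular Dyna-Q loop. This is not M-Dyna-Q: the abstract is truncated, so the multiagent extension is not attempted. The corridor environment, the `dyna_q_step` helper, and all parameter values are assumptions made for the example.

```python
import random
from collections import defaultdict


def dyna_q_step(q, model, s, a, r, s2, actions, rng,
                alpha=0.5, gamma=0.9, planning_steps=5):
    """One Dyna-Q step: learn from a real transition, then replay modelled ones."""
    def update(s, a, r, s2):
        best_next = max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    update(s, a, r, s2)        # direct learning from real experience
    model[(s, a)] = (r, s2)    # deterministic model: remember the last outcome
    seen = list(model)
    for _ in range(planning_steps):  # planning from simulated experience
        ps, pa = rng.choice(seen)
        pr, ps2 = model[(ps, pa)]
        update(ps, pa, pr, ps2)


# Hypothetical 4-state corridor: moving right from state 2 reaches the goal (3).
actions = (-1, +1)
q = defaultdict(float)
model = {}
rng = random.Random(0)

for _ in range(50):            # 50 episodes of a random behaviour policy
    s = 0
    while s != 3:
        a = rng.choice(actions)
        s2 = min(max(s + a, 0), 3)
        r = 1.0 if s2 == 3 else 0.0
        dyna_q_step(q, model, s, a, r, s2, actions, rng)
        s = s2
```

The planning loop reuses stored transitions, so value information from the goal propagates back along the corridor with fewer real steps than plain Q-learning would need.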

Outsourcing or Insourcing of Transportation System Evaluation Using Intelligent Agents Approach

Journal: Journal of Optimization in Industrial Engineering 2010

Isa Nakhaei Kamalabadi Parham Azimi Mohammad Varmaghani

Nowadays, outsourcing is viewed as a trade strategy, and organizations tend to adopt new strategies to achieve competitive advantages in the current world of business. Focusing on main competencies and transferring most activities to resources outside the organization (outsourcing) is one such strategy. In this paper, we aim to decide on decision maker agent of transportation system, by a...
