A Multiagent Variant of Dyna-Q
Abstract
This paper describes a multiagent variant of Dyna-Q called M-Dyna-Q. Dyna-Q is an integrated single-agent framework for planning, reacting, and learning. Like Dyna-Q, M-Dyna-Q employs two key ideas: learning results can serve as a valuable input for both planning and reacting, and results of planning and reacting can serve as a valuable input to learning. M-Dyna-Q extends Dyna-Q in that planning, reacting, and learning are jointly realized by multiple agents ...
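The Dyna-Q loop the abstract refers to interleaves three steps: a direct Q-learning update from each real transition, an update to a learned model, and a batch of planning updates drawn from that model. A minimal tabular sketch follows; the Gym-style environment interface (`reset`, `step`, `actions`) and all hyperparameter values are illustrative assumptions, not taken from the paper.

```python
import random
from collections import defaultdict

def dyna_q(env, episodes=50, n_planning=10, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Dyna-Q sketch: direct RL + model learning + planning."""
    Q = defaultdict(float)   # Q[(state, action)] -> value estimate
    model = {}               # model[(state, action)] -> (reward, next_state, done)

    def greedy(s, acts):
        # Break ties randomly so the agent explores before rewards are seen.
        best = max(Q[(s, a)] for a in acts)
        return random.choice([a for a in acts if Q[(s, a)] == best])

    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            acts = env.actions(s)
            a = random.choice(acts) if random.random() < eps else greedy(s, acts)
            r, s2, done = env.step(s, a)
            # (1) direct reinforcement learning from the real transition
            best_next = 0.0 if done else max(Q[(s2, b)] for b in env.actions(s2))
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            # (2) model learning: remember the observed outcome
            model[(s, a)] = (r, s2, done)
            # (3) planning: replay n simulated transitions sampled from the model
            for _ in range(n_planning):
                (ps, pa), (pr, ps2, pdone) = random.choice(list(model.items()))
                pbest = 0.0 if pdone else max(Q[(ps2, b)] for b in env.actions(ps2))
                Q[(ps, pa)] += alpha * (pr + gamma * pbest - Q[(ps, pa)])
            s = s2
    return Q
```

The planning step is what distinguishes Dyna-Q from plain Q-learning: each real step is amplified by `n_planning` simulated updates, so value information propagates far faster per environment interaction.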
Similar Resources
An Architectural Framework for Integrated Multiagent Planning, Reacting, and Learning
Dyna is a single-agent architectural framework that integrates learning, planning, and reacting. Well-known instantiations of Dyna are Dyna-AC and Dyna-Q. Here a multiagent extension of Dyna-Q is presented. This extension, called M-Dyna-Q, constitutes a novel coordination framework that bridges the gap between plan-based and reactive coordination in multiagent systems. The paper summarizes the ...
Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming
This paper extends previous work with Dyna, a class of architectures for intelligent systems based on approximating dynamic programming methods. Dyna architectures integrate trial-and-error (reinforcement) learning and execution-time planning into a single process operating alternately on the world and on a learned model of the world. In this paper, I present and show results for two Dyna archi...
Integrated Modeling and Control Based on Reinforcement Learning
This is a summary of results with Dyna, a class of architectures for intelligent systems based on approximating dynamic programming methods. Dyna architectures integrate trial-and-error (reinforcement) learning and execution-time planning into a single process operating alternately on the world and on a learned forward model of the world. We describe and show results for two Dyna architectures,...
A Fast Learning Agent Based on the Dyna Architecture
In this paper, we present a rapid learning algorithm called Dyna-QPC. The proposed algorithm requires considerably less training time than Q-learning and Table-based Dyna-Q algorithm, making it applicable to real-world control tasks. The Dyna-QPC algorithm is a combination of existing learning techniques: CMAC, Q-learning, and prioritized sweeping. In a practical experiment, the Dyna-QPC algori...
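Prioritized sweeping, one of the three techniques this abstract names, replaces Dyna-Q's uniform sampling of the model with a priority queue: state-action pairs whose value estimates would change the most are updated first, and their predecessors are queued in turn. A minimal sketch, in which the deterministic model and the predecessor bookkeeping are illustrative assumptions rather than the Dyna-QPC design:

```python
import heapq
import itertools

def prioritized_sweep(Q, model, predecessors, actions, s, a,
                      n_updates=5, alpha=0.1, gamma=0.95, theta=1e-4):
    """Replay model updates in order of expected impact on Q.

    Q:            dict-like mapping (state, action) -> value (defaults to 0.0)
    model:        (state, action) -> (reward, next_state), assumed deterministic
    predecessors: state -> iterable of (state, action) pairs leading into it
    """
    pq, tie = [], itertools.count()  # max-heap via negated priority; counter breaks ties

    def priority(ps, pa):
        pr, ps2 = model[(ps, pa)]
        return abs(pr + gamma * max(Q[(ps2, b)] for b in actions) - Q[(ps, pa)])

    p = priority(s, a)
    if p > theta:
        heapq.heappush(pq, (-p, next(tie), s, a))

    for _ in range(n_updates):
        if not pq:
            break
        _, _, ps, pa = heapq.heappop(pq)
        pr, ps2 = model[(ps, pa)]
        Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, b)] for b in actions) - Q[(ps, pa)])
        # A change at ps may make its predecessors' estimates stale: queue them too.
        for (qs, qa) in predecessors.get(ps, []):
            qp = priority(qs, qa)
            if qp > theta:
                heapq.heappush(pq, (-qp, next(tie), qs, qa))
    return Q
```

Backing up along predecessors is what makes the method fast in practice: a single surprising reward is swept backward through the state space instead of waiting to be hit by uniform random sampling.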
Reinforcement Learning–Based Energy Management Strategy for a Hybrid Electric Tracked Vehicle
This paper presents a reinforcement learning (RL)–based energy management strategy for a hybrid electric tracked vehicle. A control-oriented model of the powertrain and vehicle dynamics is first established. According to the sample information of the experimental driving schedule, statistical characteristics at various velocities are determined by extracting the transition probability matrix of...
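Extracting a transition probability matrix from a sampled driving schedule, as this abstract describes, amounts to discretizing the velocity trace into states, counting observed transitions, and normalizing each row into a distribution. A sketch under that reading; the uniform bin count and the velocity-only state definition are illustrative assumptions, not the paper's actual state space:

```python
import numpy as np

def transition_matrix(velocity_trace, n_bins=10, v_max=None):
    """Estimate a Markov transition matrix from a sampled velocity trace."""
    v = np.asarray(velocity_trace, dtype=float)
    v_max = v_max if v_max is not None else v.max()
    # Map each sampled velocity to a discrete state index in [0, n_bins).
    states = np.minimum((v / v_max * n_bins).astype(int), n_bins - 1)
    counts = np.zeros((n_bins, n_bins))
    for s, s2 in zip(states[:-1], states[1:]):
        counts[s, s2] += 1
    # Normalize each visited row; unvisited rows stay all-zero.
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)
```

Each row of the result is the empirical distribution over next-velocity states, which is exactly the statistical characterization of the driving schedule that a model-based or RL controller can then sample from.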