Optimal Convergence in Multi-Agent MDPs
Abstract
Learning Automata (LA) were recently shown to be valuable tools for designing Multi-Agent Reinforcement Learning algorithms. One of the principal contributions of LA theory is that a set of decentralized, independent learning automata is able to control a finite Markov chain with unknown transition probabilities and rewards. We extend this result to the framework of Multi-Agent MDPs, a straightforward extension of single-agent MDPs to distributed cooperative multi-agent decision problems. Furthermore, we combine this result with parametrized learning automata, which yields globally optimal convergence results.
Similar resources
A class of multi-agent discrete hybrid non linearizable systems: Optimal controller design based on quasi-Newton algorithm for a class of sign-undefinite hessian cost functions
In the present paper, a class of hybrid, nonlinear and non linearizable dynamic systems is considered. The noted dynamic system is generalized to a multi-agent configuration. The interaction of agents is presented based on graph theory and finally, an interaction tensor defines the multi-agent system in leader-follower consensus in order to design a desirable controller for the noted system. A...
Asymptotic properties of constrained Markov Decision Processes
We present in this paper several asymptotic properties of constrained Markov Decision Processes (MDPs) with a countable state space. We treat both the discounted and the expected average cost, with unbounded cost. We are interested in (1) the convergence of finite horizon MDPs to the infinite horizon MDP, (2) convergence of MDPs with a truncated state space to the problem with infinite state space,...
Event-Detecting Multi-Agent MDPs: Complexity and Constant-Factor Approximations
Planning under uncertainty for multiple agents has grown rapidly with the development of formal models such as multi-agent MDPs and decentralized MDPs. But despite their richness, the applicability of these models remains limited due to their computational complexity. We present the class of event-detecting multi-agent MDPs (eMMDPs), designed to detect multiple mobile targets by a team of senso...
Event-Detecting Multi-Agent MDPs: Complexity and Constant-Factor Approximation
Planning under uncertainty for multiple agents has grown rapidly with the development of formal models such as multi-agent MDPs and decentralized MDPs. But despite their richness, the applicability of these models remains limited due to their computational complexity. We present the class of event-detecting multi-agent MDPs (eMMDPs), designed to detect multiple mobile targets by a team of senso...
An Accelerated Gradient Method for Distributed Multi-Agent Planning with Factored MDPs
We study optimization for collaborative multi-agent planning in factored Markov decision processes (MDPs) with shared resource constraints. Following past research, we derive a distributed planning algorithm for this setting based on Lagrangian relaxation: we optimize a convex dual function which maps a vector of resource prices to a bound on the achievable utility. Since the dual function is n...