Localization and a Distributed Local Optimal Solution Algorithm for a Class of Multi-Agent Markov Decision Processes
Abstract
We consider discrete-time factorial Markov decision processes (MDPs) in an environment with multiple decision makers, under the infinite-horizon average-reward criterion, with a general joint reward structure but a factorial joint state-transition structure. We introduce the concept of “localization,” in which the global MDP is localized for each agent so that each agent needs to consider only a local MDP defined on its own state and action spaces. Based on this, we present a gradient-ascent-like iterative distributed algorithm that converges to a locally optimal solution of the global MDP. The solution is an autonomous joint policy, in that each agent’s decision is based only on its local state.
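The abstract does not give the update rule, so the following is only an illustrative sketch of the setting, not the paper's algorithm. It assumes two agents, deterministic local policies, ergodic local chains, and transitions in which each agent's next state depends only on its own state and action (one reading of the "factorial" structure). In place of the gradient-ascent-like step it uses a plain per-agent exhaustive search (coordinate ascent), which also stops at a joint policy that no single agent can improve alone. All names (`stationary`, `local_chain`, `average_reward`, `coordinate_ascent`) and the toy numbers are hypothetical.

```python
from itertools import product


def stationary(P, iters=2000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration (assumes the chain is ergodic)."""
    n = len(P)
    mu = [1.0 / n] * n
    for _ in range(iters):
        mu = [sum(mu[s] * P[s][t] for s in range(n)) for t in range(n)]
    return mu


def local_chain(P_i, pi_i):
    """Transition matrix of one agent under its local policy.
    P_i[a][s] is the next-state distribution for action a in local state s;
    pi_i[s] is the action the agent takes in local state s."""
    return [P_i[pi_i[s]][s] for s in range(len(pi_i))]


def average_reward(P, R, pi):
    """Long-run average reward of the autonomous joint policy pi = (pi_1, pi_2).
    Under the independence assumption the joint stationary distribution is the
    product of the two local stationary distributions."""
    mu1 = stationary(local_chain(P[0], pi[0]))
    mu2 = stationary(local_chain(P[1], pi[1]))
    return sum(
        mu1[s1] * mu2[s2] * R[s1][s2][pi[0][s1]][pi[1][s2]]
        for s1 in range(len(mu1))
        for s2 in range(len(mu2))
    )


def coordinate_ascent(P, R, pi, n_actions):
    """Each agent in turn searches its own local policies while the other
    agent's policy is held fixed; stop when no agent can improve alone."""
    pi = [tuple(p) for p in pi]
    best = average_reward(P, R, pi)
    improved = True
    while improved:
        improved = False
        for i in range(2):
            for cand in product(range(n_actions[i]), repeat=len(pi[i])):
                trial = list(pi)
                trial[i] = cand
                g = average_reward(P, R, trial)
                if g > best + 1e-12:
                    pi, best, improved = trial, g, True
    return pi, best


# Toy instance (made-up numbers): action 0 tends to keep the local state,
# action 1 tends to flip it; the joint reward is 1 only when both agents
# are in local state 0, so the reward couples the agents.
P_agent = [
    [[0.9, 0.1], [0.1, 0.9]],  # action 0
    [[0.1, 0.9], [0.9, 0.1]],  # action 1
]
R = [[[[1.0 if (s1 == 0 and s2 == 0) else 0.0 for _a2 in range(2)]
       for _a1 in range(2)] for s2 in range(2)] for s1 in range(2)]

policy, gain = coordinate_ascent([P_agent, P_agent], R, [(0, 0), (0, 0)], [2, 2])
print(policy, gain)
```

Each agent's policy here is a map from its own local state to an action, which is what makes the resulting joint policy "autonomous" in the abstract's sense. The exhaustive inner search is only practical for tiny local action and state spaces.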
Similar resources
Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
A Distributed Algorithm for Solving a Class of Multi-agent Markov Decision Problems
We consider a class of infinite horizon Markov decision processes (MDPs) with multiple decision makers, called agents, and a general joint reward structure, but a special decomposable state/action structure such that each individual agent’s actions affect the system’s state transitions independently from the actions of all other agents. We introduce the concept of “localization,” where each age...
A Hybrid Algorithm using Firefly, Genetic, and Local Search Algorithms
In this paper, a hybrid multi-objective algorithm consisting of features of genetic and firefly algorithms is presented. The algorithm starts with a set of fireflies (particles) that are randomly distributed in the solution space; these particles converge to the optimal solution of the problem during the evolutionary stages. Then, a local search plan is presented and implemented for searching s...
Distributed Generation Expansion Planning Considering Load Growth Uncertainty: A Novel Multi-Period Stochastic Model
Distributed generation (DG) technology is known as an efficient solution for distribution system planning (DSP) problems. Load growth uncertainty in the distribution network is a significant source of uncertainty that strongly affects the optimal management of DGs. In order to handle this problem, a novel model is proposed in this paper based on DG solution, consideri...
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...