Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Authors
Abstract:
Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, the MMDP problem is described as a directed graph in which the nodes are the states of the problem and the directed edges represent the actions that result in a transition from one state to another. Each state of the environment is equipped with a generalized learning automaton whose actions are moves to the states adjacent to that state. Each agent moves from one state to another and tries to reach the goal state. In each state, the agent chooses its next transition with the help of the generalized learning automaton in that state. The experimental results show that the proposed algorithm has better learning performance, in terms of the speed of reaching the optimal policy, than existing learning algorithms.
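The graph-based setup described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration, not the paper's method: the three-state graph, the class name `SoftmaxAutomaton`, the softmax parameterization, the REINFORCE-style update, the reward of one over the path length, and the use of a single agent. The abstract does not state the automaton's update rule, so a simple policy-gradient rule stands in for it here.

```python
import math
import random


class SoftmaxAutomaton:
    """Hypothetical stand-in for a generalized learning automaton.

    It keeps one preference weight per outgoing edge of its state and
    turns the weights into action probabilities with a softmax.
    """

    def __init__(self, actions, learning_rate=0.1):
        self.actions = list(actions)
        self.weights = [0.0] * len(self.actions)
        self.learning_rate = learning_rate

    def probabilities(self):
        peak = max(self.weights)  # subtract the max for numerical stability
        exps = [math.exp(w - peak) for w in self.weights]
        total = sum(exps)
        return [e / total for e in exps]

    def choose(self, rng):
        # Sample the index of the next edge according to the probabilities.
        return rng.choices(range(len(self.actions)), weights=self.probabilities())[0]

    def update(self, chosen, reward):
        # REINFORCE-style step (an assumption): move the weights along the
        # gradient of log-probability of the chosen edge, scaled by reward.
        probs = self.probabilities()
        for i in range(len(self.weights)):
            grad = (1.0 if i == chosen else 0.0) - probs[i]
            self.weights[i] += self.learning_rate * reward * grad


def run_episode(graph, automata, start, goal, rng, max_steps=20):
    """Walk from start toward goal, then reinforce every visited automaton."""
    state, trajectory = start, []
    for _ in range(max_steps):
        if state == goal:
            break
        chosen = automata[state].choose(rng)
        trajectory.append((state, chosen))
        state = graph[state][chosen]
    reached = state == goal
    # Assumed reward: shorter successful paths earn more; failures earn nothing.
    reward = 1.0 / len(trajectory) if reached and trajectory else 0.0
    for visited, chosen in trajectory:
        automata[visited].update(chosen, reward)
    return len(trajectory), reached


def train(episodes=2000, seed=0):
    # Hypothetical graph: each key is a state, each list its adjacent states.
    graph = {"A": ["B", "C"], "B": ["A", "G"], "C": ["A", "B"]}
    rng = random.Random(seed)
    automata = {s: SoftmaxAutomaton(nbrs) for s, nbrs in graph.items()}
    for _ in range(episodes):
        run_episode(graph, automata, "A", "G", rng)
    return graph, automata
```

Calling `train()` returns the graph and the trained automata; `automata["B"].probabilities()` then gives the learned probabilities of moving from state `B` to `A` or to the goal `G`. Keeping one automaton per state mirrors the abstract's description, so each state's choice is learned locally from the reward of the episodes that passed through it.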
Similar resources
Learning Automata based Algorithms for Finding Optimal Policies in Fully Cooperative Markov Games
Markov games, as the generalization of Markov decision processes to the multi agent case, have long been used for modeling multi-agent systems. In this paper, several learning automata based multi-agent system algorithms for finding optimal policies in fully-cooperative Markov Games are proposed. In the proposed algorithms, Markov problem is described as a directed graph in which the nodes are ...
Learning Automata Based Multi-agent System Algorithms for Finding Optimal Policies in Markov Games
Markov games, as the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems (MAS). The Markov game view of MAS is considered as a sequence of games having to be played by multiple players while each game belongs to a different state of the environment. In this paper, several learning automata based multiagent system algorithms f...
Finding Optimal Refueling Policies in Transportation Networks
We study the combinatorial properties of optimal refueling policies, which specify the transportation paths and the refueling operations along the paths to minimize the total transportation costs between vertices. The insight into the structure of optimal refueling policies leads to an elegant reduction of the problem of finding optimal refueling policies into the classical shortest path proble...
A linear-time algorithm for finding optimal vehicle refueling policies
We explore a fixed-route vehicle refueling problem as a special case of the inventory-capacitated lot-sizing problem, and present a linear-time greedy algorithm for finding optimal refueling policies.
On Finding Optimal Policies for Markovian Decision Processes Using Simulation
A simulation method is developed, to find an optimal policy for the expected average reward of a Markovian Decision Process. It is shown that the method is consistent, in the sense that it produces solutions arbitrarily close to the optimal. Various types of estimation errors are examined, and bounds are developed.
Journal title
volume 6 issue 2
pages 15-22
publication date 2013-02-01