Accelerated decomposition techniques for large discounted Markov decision processes

Authors

  • Abdelhadi Larach
  • C. Daoui
  • S. Chafik

Faculty of Sciences and Techniques, Laboratory of Information Processing and Decision Support, Sultan Moulay Slimane University, B.P. 523, Beni Mellal, Morocco
Abstract

Many hierarchical techniques for solving large Markov decision processes (MDPs) are based on partitioning the state space into strongly connected components (SCCs), which can in turn be grouped into levels. At each level, smaller problems called restricted MDPs are solved, and these partial solutions are then combined to obtain the global solution. In this paper, we first propose a novel algorithm, a variant of Tarjan's algorithm, that simultaneously finds the SCCs and the levels they belong to. Second, a new definition of restricted MDPs is presented to improve some hierarchical solutions of discounted MDPs using a value iteration (VI) algorithm based on a list of state-action successors. Finally, a robotic motion-planning example and experimental results are presented to illustrate the benefit of the proposed decomposition algorithms.
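
To illustrate the first contribution: the sketch below is not the paper's exact single-pass variant, only a plain Python illustration of the two ingredients it combines. It runs an iterative version of the classical Tarjan algorithm, then exploits the fact that Tarjan emits SCCs in reverse topological order of the condensation DAG to assign levels in one extra sweep. All names, and the convention that an SCC with no transitions into other SCCs gets level 0, are assumptions of this sketch.

```python
def tarjan_scc(graph):
    """Iterative Tarjan's algorithm.

    graph: dict mapping every node to a list of successor nodes.
    Returns the SCCs as lists of nodes, emitted in reverse topological
    order of the condensation DAG (sink components first).
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    sccs, counter = [], 0
    for root in graph:
        if root in index:
            continue
        index[root] = low[root] = counter
        counter += 1
        stack.append(root)
        on_stack.add(root)
        work = [(root, iter(graph[root]))]
        while work:
            node, successors = work[-1]
            pushed = False
            for nxt in successors:
                if nxt not in index:          # tree edge: descend
                    index[nxt] = low[nxt] = counter
                    counter += 1
                    stack.append(nxt)
                    on_stack.add(nxt)
                    work.append((nxt, iter(graph[nxt])))
                    pushed = True
                    break
                if nxt in on_stack:           # edge back into the current stack
                    low[node] = min(low[node], index[nxt])
            if pushed:
                continue
            work.pop()                        # node is fully explored
            if work:
                parent = work[-1][0]
                low[parent] = min(low[parent], low[node])
            if low[node] == index[node]:      # node is the root of an SCC
                component = []
                while True:
                    w = stack.pop()
                    on_stack.discard(w)
                    component.append(w)
                    if w == node:
                        break
                sccs.append(component)
    return sccs

def scc_levels(graph, sccs):
    """Assign each SCC a level: 0 for components with no outgoing edges to
    other SCCs, otherwise 1 + the maximum level among successor components.
    Relies on sccs being in reverse topological order (as Tarjan emits them),
    so every successor component already has its level when it is needed."""
    comp_of = {v: i for i, comp in enumerate(sccs) for v in comp}
    levels = {}
    for i, comp in enumerate(sccs):
        succ = {comp_of[v] for u in comp for v in graph[u]} - {i}
        levels[i] = 1 + max((levels[j] for j in succ), default=-1)
    return levels
```

Under this convention, level-0 components can be solved first as restricted MDPs, and each higher level reuses the values already computed below it.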
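
The abstract also mentions a VI algorithm based on a list of state-action successors, i.e., iterating only over the successor states reachable with nonzero probability from each state-action pair. Here is a minimal sketch of standard discounted value iteration organized around such sparse successor lists; the data layout, the names, and the Gauss-Seidel-style in-place update are assumptions of the sketch, not necessarily the paper's formulation.

```python
def value_iteration(states, actions, succ, reward, gamma=0.95, eps=1e-6):
    """Discounted value iteration over sparse successor lists.

    states         -- iterable of states
    actions[s]     -- nonempty list of actions available in state s
    succ[(s, a)]   -- list of (next_state, probability) pairs with p > 0
    reward[(s, a)] -- immediate reward (assumes 0 < gamma < 1)
    Returns the value function and a greedy policy.
    """
    def q(s, a, v):
        # One-step lookahead using only the nonzero-probability successors.
        return reward[(s, a)] + gamma * sum(p * v[t] for t, p in succ[(s, a)])

    v = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q(s, a, v) for a in actions[s])
            delta = max(delta, abs(best - v[s]))
            v[s] = best                   # in-place (Gauss-Seidel) update
        # Standard stopping rule guaranteeing an eps-optimal value function.
        if delta < eps * (1 - gamma) / (2 * gamma):
            break
    policy = {s: max(actions[s], key=lambda a: q(s, a, v)) for s in states}
    return v, policy
```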

Similar resources

Discounted Markov decision processes with utility constraints

We consider utility-constrained Markov decision processes. The expected utility of the total discounted reward is maximized subject to multiple expected utility constraints. By introducing a corresponding Lagrange function, a saddle-point theorem for the utility-constrained optimization is derived. The existence of a constrained optimal policy is characterized by optimal action sets specified w...
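
For orientation, in standard constrained-MDP notation (mine, not necessarily the article's): with utility functions U_i, targets c_i, and multipliers lambda_i >= 0, the problem and its Lagrange function take roughly the form

```latex
\begin{aligned}
\max_{\pi}\;\; & \mathbb{E}^{\pi}\Big[U_0\Big(\sum_{t\ge 0}\gamma^{t} r_0(s_t,a_t)\Big)\Big] \\
\text{s.t.}\;\; & \mathbb{E}^{\pi}\Big[U_i\Big(\sum_{t\ge 0}\gamma^{t} r_i(s_t,a_t)\Big)\Big] \ge c_i,
  \quad i=1,\dots,m, \\
L(\pi,\lambda) \;=\;\; & \mathbb{E}^{\pi}\big[U_0(\cdot)\big]
  + \sum_{i=1}^{m}\lambda_i\Big(\mathbb{E}^{\pi}\big[U_i(\cdot)\big] - c_i\Big),
  \quad \lambda_i \ge 0,
\end{aligned}
```

and a saddle point of L yields a constrained optimal policy.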


Simplex Algorithm for Countable-State Discounted Markov Decision Processes

We consider discounted Markov Decision Processes (MDPs) with countably-infinite state spaces, finite action spaces, and unbounded rewards. Typical examples of such MDPs are inventory management and queueing control problems in which there is no specific limit on the size of inventory or queue. Existing solution methods obtain a sequence of policies that converges to optimality i...
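
As background (the standard finite-state construction, not the article's countable-state extension): simplex methods apply to discounted MDPs through the linear-programming formulation of the optimality equations, with any positive weights alpha(s),

```latex
\begin{aligned}
\min_{v}\;\; & \sum_{s} \alpha(s)\, v(s) \\
\text{s.t.}\;\; & v(s) \;\ge\; r(s,a) + \gamma \sum_{s'} p(s' \mid s, a)\, v(s')
  \quad \text{for all } s, a.
\end{aligned}
```

In the dual, the variables are discounted occupation measures, and simplex pivots correspond to single-state policy improvements.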


Improved successive approximation methods for discounted Markov decision processes

Successive Approximation (S.A.) methods for solving discounted Markov decision problems have been developed to avoid the extensive computations connected with linear programming and policy iteration techniques for solving large-scale problems. Several authors give such an S.A. algorithm. In this paper we introduce some new algorithms, while furthermore it will be shown how the severa...
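
The successive approximation schemes in question iterate the Bellman optimality operator; in standard notation (mine, for orientation),

```latex
(Tv)(s) \;=\; \max_{a}\Big\{ r(s,a) + \gamma \sum_{s'} p(s' \mid s,a)\, v(s') \Big\},
\qquad
\lVert Tv - Tw \rVert_{\infty} \;\le\; \gamma\, \lVert v - w \rVert_{\infty},
```

so the iterates v_{n+1} = T v_n converge geometrically to the unique fixed point v* from any starting point; improvements of this type typically concern tighter value bounds, stopping rules, and update orders.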


Continuous Time Markov Decision Processes with Expected Discounted Total Rewards

This paper discusses continuous-time Markov decision processes with the criterion of expected discounted total rewards, where the state space is countable, the reward rate function is extended real-valued, and the discount rate is a real number. Under necessary conditions for the model to be well defined, the state space is partitioned into three subsets, on which the optimal value function ...
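
For reference, the expected discounted total reward in continuous time is usually defined as (standard notation; note that the cited abstract allows the discount rate alpha to be any real number)

```latex
V(\pi, s) \;=\; \mathbb{E}^{\pi}_{s}\!\left[ \int_{0}^{\infty} e^{-\alpha t}\, r(x_t, a_t)\, \mathrm{d}t \right].
```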


Discounted Continuous Time Markov Decision Processes: the Convex Analytic Approach

The convex analytic approach, which is dual, in some sense, to dynamic programming, is useful for the investigation of multicriteria control problems. It is well known for discrete-time models, and the current paper presents similar results for the continuous-time case. Namely, we define and study the space of occupation measures, and apply abstract convex analysis to the study of constraine...
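
The occupation measures mentioned here are, in the standard discounted continuous-time setting (a textbook definition, not necessarily the article's exact normalization), the expected discounted time spent in state-action sets:

```latex
\eta^{\pi}(\Gamma_S \times \Gamma_A) \;=\;
\mathbb{E}^{\pi}\!\left[ \int_{0}^{\infty} \alpha\, e^{-\alpha t}\,
\mathbf{1}\{x_t \in \Gamma_S,\ a_t \in \Gamma_A\}\, \mathrm{d}t \right],
```

over which constrained and multicriteria control problems become linear programs on a convex set.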


Volume 13, issue 4
Publication date: 2017-12-01
