On the Complexity of Finite Markov Decision Processes

Authors

  • Danièle Beauquier
  • Dima Burago
  • Michel de Rougemont
  • Anatol Slissenko
Abstract

Contents: Introduction. Main notions. Optimal strategies under Total Observability. Complexity of Markov strategies. Finite Memory Strategies. Randomized strategies. Total Unobservability. Bounded Unobservability.

Author footnotes:

  • Address: Université Paris-12, Equipe d’Informatique Fondamentale, 61, Ave. du Général de Gaulle, 94010 Créteil, France. E-mail: [email protected]
  • Address: Dept. of Mathematics, Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]
  • The research of this author was supported by DRET and Armines contract 92-0171.00.1013.
  • St-Petersburg Institute for Informatics and Automation of the Academy of Sciences of Russia.
  • Address: Université Paris-11, Centre Orsay, L.R.I., Bât. 490, F-91405 Orsay, France. E-mail: [email protected]
  • Address: Université Paris-12, Equipe d’Informatique Fondamentale, 61, Ave. du Général de Gaulle, 94010 Créteil, France. E-mail: [email protected]
  • The research of this author was partially supported by DRET contract 91/1061.

Similar resources

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...

The Complexity of Deterministically Observable Finite-Horizon Markov Decision Processes

We consider the complexity of the decision problem for different types of partially-observable Markov decision processes (MDPs): given an MDP, does there exist a policy with performance > 0? Lower and upper bounds on the complexity of the decision problems are shown in terms of completeness for NL, P, NP, PSPACE, EXP, NEXP or EXPSPACE, dependent on the type of the Markov decision process. For se...

Extended Geometric Processes: Semiparametric Estimation and Application to Reliability

Keywords: imperfect repair, Markov renewal equation, replacement policy

Lam (2007) introduces a generalization of renewal processes named Geometric processes, where inter-arrival times are independent and identically distributed up to a multiplicative scale parameter, in a geometric fashion. We here envision a more general scaling, not necessarily geometric. The corresponding counting process is named Extended Geometric Process (EGP). Semiparametric estimates are...

Reduction of Computational Complexity in Finite State Automata Explosion of Networked System Diagnosis (RESEARCH NOTE)

This research puts forward rough finite state automata represented by two variants of BDD, called ROBDD and ZBDD. The proposed structures have been used in networked system diagnosis and can overcome combinatorial explosion. The implementation uses the CUDD (Colorado University Decision Diagrams) package. A mathematical proof of the claimed complexity is provided, which shows ZBDD ...

Strategy Complexity of Finite-Horizon Markov Decision Processes and Simple Stochastic Games

Markov decision processes (MDPs) and simple stochastic games (SSGs) provide a rich mathematical framework to study many important problems related to probabilistic systems. MDPs and SSGs with finite-horizon objectives, where the goal is to maximize the probability to reach a target state in a given finite time, is a classical and well-studied problem. In this work we consider the strategy compl...

Risk-Sensitive and Average Optimality in Markov Decision Processes

Abstract. This contribution is devoted to the risk-sensitive optimality criteria in finite state Markov Decision Processes. At first, we rederive necessary and sufficient conditions for average optimality of (classical) risk-neutral unichain models. This approach is then extended to the risk-sensitive case, i.e., when expectation of the stream of one-stage costs (or rewards) generated by a Mark...

Publication date: 2005