Another set of verifiable conditions for average Markov decision processes with Borel spaces
Related articles
Average Optimality for Markov Decision Processes in Borel Spaces: a New Condition and Approach
In this paper we study discrete-time Markov decision processes with Borel state and action spaces. The criterion is to minimize average expected costs, and the costs may have neither upper nor lower bounds. We first provide two average optimality inequalities of opposing directions and give conditions for the existence of solutions to them. Then, using the two inequalities, we ensure the existen...
Time and Ratio Expected Average Cost Optimality for Semi-Markov Control Processes on Borel Spaces
We deal with semi-Markov control models with Borel state and control spaces, and unbounded cost functions under the ratio and the time expected average cost criteria. Under suitable growth conditions on the costs and the mean holding times together with stability conditions on the embedded Markov chains, we show the following facts: (i) the ratio and the time average costs coincide in the class...
On the Asymptotic Optimality of Finite Approximations to Markov Decision Processes with Borel Spaces
Calculating optimal policies is known to be computationally difficult for Markov decision processes with Borel state and action spaces and for partially observed Markov decision processes even with finite state and action spaces. This paper studies finite-state approximations of discrete-time Markov decision processes with Borel state and action spaces, for both discounted and average...
Learning Algorithms for Markov Decision Processes with Average Cost
This paper gives the first rigorous convergence analysis of analogs of Watkins’ Q-learning algorithm, applied to average cost control of finite-state Markov chains. We discuss two algorithms which may be viewed as stochastic approximation counterparts of two existing algorithms for recursively computing the value function of the average cost problem: the traditional relative value iteration algorith...
Average-Reward Decentralized Markov Decision Processes
Formal analysis of decentralized decision making has become a thriving research area in recent years, producing a number of multi-agent extensions of Markov decision processes. While much of the work has focused on optimizing discounted cumulative reward, optimizing average reward is sometimes a more suitable criterion. We formalize a class of such problems and analyze its characteristics, show...
Journal
Journal title: Kybernetika
Year: 2015
ISSN: 0023-5954,1805-949X
DOI: 10.14736/kyb-2015-2-0276