Q-learning and policy iteration algorithms for stochastic shortest path problems
Similar resources
LIDS Report 2871: Q-Learning and Policy Iteration Algorithms for Stochastic Shortest Path Problems
We consider the stochastic shortest path problem, a classical finite-state Markovian decision problem with a termination state, and we propose new convergent Q-learning algorithms that combine elements of policy iteration and classical Q-learning/value iteration. These algorithms are related to the ones introduced by the authors for discounted problems in [BY10b]. The main difference from the s...
Q-learning and policy iteration algorithms for stochastic shortest path problems
We consider the stochastic shortest path problem, a classical finite-state Markovian decision problem with a termination state, and we propose new convergent Q-learning algorithms that combine elements of policy iteration and classical Q-learning/value iteration. These algorithms are related to the ones introduced by the authors for discounted problems in Bertsekas and Yu (Math. Oper. Res. 37(1...
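As a rough illustration of the classical value iteration on Q-factors that the abstract refers to, the sketch below runs synchronous sweeps on a tiny stochastic shortest path problem. It is not the authors' algorithm: the three-state model, the `transitions` layout, and the function name `q_value_iteration` are all assumptions made up for this example, and the update uses the known transition model rather than sampled transitions.

```python
# Hypothetical toy SSP (an assumption for illustration only).
# transitions[state][action] = list of (probability, next_state, cost)
TERMINAL = 2
transitions = {
    0: {"a": [(1.0, 1, 1.0)],
        "b": [(0.5, TERMINAL, 2.0), (0.5, 0, 2.0)]},
    1: {"a": [(1.0, TERMINAL, 1.0)]},
}


def q_value_iteration(transitions, terminal, sweeps=50):
    """Synchronous value iteration on Q-factors for a finite SSP.

    Update: Q(i,u) <- sum_j p_ij(u) * (g(i,u,j) + min_v Q(j,v)),
    with the cost-to-go of the termination state fixed at zero.
    """
    q = {(s, u): 0.0 for s, acts in transitions.items() for u in acts}

    def best(q_old, s):
        # Minimal Q-factor at state s; the termination state is cost-free.
        if s == terminal:
            return 0.0
        return min(q_old[(s, u)] for u in transitions[s])

    for _ in range(sweeps):
        # Build the new table entirely from the previous one.
        q = {
            (s, u): sum(p * (cost + best(q, nxt)) for p, nxt, cost in outcomes)
            for s, acts in transitions.items()
            for u, outcomes in acts.items()
        }
    return q


q = q_value_iteration(transitions, TERMINAL)
# Greedy policy with respect to the computed Q-factors.
policy = {s: min(acts, key=lambda u: q[(s, u)]) for s, acts in transitions.items()}
```

In this toy model the greedy policy moves from state 0 to state 1 and then terminates, rather than repeatedly attempting the risky action `"b"`.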
Policy Iteration Algorithm for Shortest Path Problems
The shortest paths tree problem consists in finding a spanning tree rooted at a given node, in a directed weighted graph, such that for each node i, the path of the tree which goes from i to the root has minimal weight. We propose an algorithm which is a deterministic version of Howard’s policy iteration scheme. We show that policy iteration is faster than the Bellman (or value itera...
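The general idea of policy iteration on a deterministic shortest path problem can be sketched as follows. This is a minimal illustration and not the algorithm of the cited paper: the graph, the initial policy, and the function names `evaluate` and `policy_iteration` are hypothetical, and the sketch assumes positive arc weights and an initial policy whose arcs already form a tree into the root.

```python
def evaluate(policy, graph, root):
    """Weight of the path from each node to the root under the policy.

    Assumes the policy contains no cycle (every node reaches the root).
    """
    value = {root: 0.0}

    def walk(node):
        if node not in value:
            nxt = policy[node]
            value[node] = graph[node][nxt] + walk(nxt)
        return value[node]

    for node in policy:
        walk(node)
    return value


def policy_iteration(graph, root, policy):
    """Alternate policy evaluation and greedy improvement until stable."""
    policy = dict(policy)
    while True:
        value = evaluate(policy, graph, root)
        changed = False
        for node, arcs in graph.items():
            if node == root:
                continue
            # Successor that minimizes arc weight plus current path weight.
            best = min(arcs, key=lambda j: arcs[j] + value[j])
            # Switch only on strict improvement, so ties keep the old arc.
            if arcs[best] + value[best] < value[node]:
                policy[node] = best
                changed = True
        if not changed:
            return policy, value


# Hypothetical graph: graph[i][j] is the weight of arc i -> j.
graph = {
    "A": {"R": 10.0, "B": 1.0},
    "B": {"R": 2.0},
    "C": {"A": 1.0, "R": 20.0},
    "R": {},
}
initial = {"A": "R", "B": "R", "C": "R"}  # every node points straight at the root
tree, dist = policy_iteration(graph, "R", initial)
```

Here `tree` maps each node to its successor in the resulting tree and `dist` holds the corresponding path weights to the root.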
ALGORITHMS FOR BIOBJECTIVE SHORTEST PATH PROBLEMS IN FUZZY NETWORKS
We consider biobjective shortest path problems in networks with fuzzy arc lengths. Considering the available studies for single objective shortest path problems in fuzzy networks, using a distance function for comparison of fuzzy numbers, we propose three approaches for solving the biobjective problems. The first and second approaches are extensions of the labeling method to solve the sing...
Stochastic Shortest Path Games and Q-Learning
We consider a class of two-player zero-sum stochastic games with finite state and compact control spaces, which we call stochastic shortest path (SSP) games. They are total cost stochastic dynamic games that have a cost-free termination state. Based on their close connection to single-player SSP problems, we introduce model conditions that characterize a general subclass of these games that have...
Journal
Journal title: Annals of Operations Research
Year: 2012
ISSN: 0254-5330,1572-9338
DOI: 10.1007/s10479-012-1128-z