Scaling Techniques for Large Markov Decision Process Planning Problems
Abstract
Planning in Large Domains: The Markov decision process (MDP) formalism has emerged as a powerful representation for control and planning domains that are subject to stochastic effects. In particular, MDPs model situations in which an agent can exactly observe all relevant aspects of the world's state, but in which the effects of the agent's actions are nondeterministic. Though the theory of MDPs is well developed and exact planning algorithms are known [3], these methods do not scale to the exponentially large state spaces common in AI problems. In this project, we are examining approaches that reduce the complexity of MDP planning in such large state spaces, with an emphasis on classes of problems that arise in mobile robotics applications.
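For concreteness, the exact planning algorithms cited above include value iteration, which sweeps a Bellman backup over the full state space until convergence; the per-sweep cost over every (state, action, successor) triple is precisely what fails to scale when the state space is exponentially large. A minimal sketch, assuming tabular transition and reward arrays (the shapes and names here are illustrative, not from the project):

```python
import numpy as np

def value_iteration(T, R, gamma=0.95, tol=1e-6):
    """Exact MDP planning by value iteration.

    T: transition probabilities, shape (A, S, S); T[a, s, s2] = P(s2 | s, a)
    R: expected rewards, shape (A, S)
    Returns the optimal value function and a greedy policy.
    """
    V = np.zeros(T.shape[1])
    while True:
        # Bellman backup: Q[a, s] = R[a, s] + gamma * sum_s2 T[a, s, s2] * V[s2]
        Q = R + gamma * (T @ V)
        V_new = Q.max(axis=0)          # best achievable value at each state
        if np.abs(V_new - V).max() < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new
```

Each sweep costs O(|A||S|^2) in the tabular setting, so a state space that is exponential in the number of state variables makes this approach infeasible and motivates the complexity-reduction techniques the project studies.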
Similar Resources
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques for solving large Markov decision processes (MDPs) are based on partitioning the state space into strongly connected components (SCCs), which can be grouped into levels. At each level, smaller problems called restricted MDPs are solved, and these partial solutions are then combined to obtain the global solution. In this paper, we first propose a novel algorith...
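As a hedged illustration of the decomposition step such hierarchical methods share, the state space can be partitioned into SCCs of the MDP's transition graph, each inducing a restricted MDP. A sketch using SciPy's strongly-connected-components routine (the boolean reachability encoding is an assumption, not the paper's representation):

```python
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def partition_into_sccs(reachable):
    """Group MDP states into strongly connected components.

    reachable: (S, S) boolean matrix; reachable[s, s2] is True if some
    action moves s to s2 with nonzero probability.
    Returns (n_sccs, labels), where labels[s] is the component index of s.
    """
    n_sccs, labels = connected_components(csr_matrix(reachable),
                                          directed=True, connection='strong')
    return n_sccs, labels

# Each label then names a restricted MDP; solving the components in a
# topological order of the component DAG lets earlier solutions feed
# into later ones, as the hierarchical schemes described above require.
```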
Monitoring plan execution in partially observable stochastic worlds
This thesis presents two novel algorithms for monitoring plan execution in stochastic, partially observable environments. Such problems are naturally formulated as partially observable Markov decision processes (POMDPs). Exact solutions to POMDPs are computationally intractable, so many approximate solvers have been proposed instead. These POMDP solvers tend to ge...
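Part of what makes exact POMDP solving intractable is that policies must be defined over beliefs, i.e. probability distributions over states, updated by Bayes' rule after every action and observation: b'(s') is proportional to O(o | s', a) multiplied by the sum over s of T(s' | s, a) b(s). A minimal sketch of that update, with assumed array shapes:

```python
import numpy as np

def belief_update(b, T, O, a, o):
    """Bayesian belief update for a POMDP.

    b: current belief over states, shape (S,)
    T: transition model, T[a, s, s2] = P(s2 | s, a), shape (A, S, S)
    O: observation model, O[a, s2, o] = P(o | s2, a), shape (A, S, num_obs)
    Returns the posterior belief after taking action a and observing o.
    """
    predicted = b @ T[a]                # P(s2 | b, a), shape (S,)
    posterior = O[a][:, o] * predicted  # weight by observation likelihood
    return posterior / posterior.sum()  # normalize; assumes P(o | b, a) > 0
```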
A Hybridized Planner for Stochastic Domains
Markov Decision Processes are a powerful framework for planning under uncertainty, but current algorithms have difficulty scaling to large problems. We present a novel probabilistic planner based on the notion of hybridizing two algorithms. In particular, we hybridize GPT, an exact MDP solver, with MBP, a planner that plans using a qualitative (nondeterministic) model of uncertainty. Whereas ...
Learning Policies for Partially Observable Environments: Scaling Up
Partially observable Markov decision processes (pomdp's) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor feedback. While the study of pomdp's is motivated by a need to address realistic problems, existing techniques for finding optimal behavior do not appear to scale well and have been unable to find satisfactory policies for problem...