Point-Based Value Iteration for Finite-Horizon POMDPs
نویسندگان
چکیده
منابع مشابه
Point-Based Value Iteration for Continuous POMDPs
We propose a novel approach to optimize Partially Observable Markov Decisions Processes (POMDPs) defined on continuous spaces. To date, most algorithms for model-based POMDPs are restricted to discrete states, actions, and observations, but many real-world problems such as, for instance, robot navigation, are naturally defined on continuous spaces. In this work, we demonstrate that the value fu...
متن کاملPoint-Based Value Iteration for Constrained POMDPs
Constrained partially observable Markov decision processes (CPOMDPs) extend the standard POMDPs by allowing the specification of constraints on some aspects of the policy in addition to the optimality objective for the value function. CPOMDPs have many practical advantages over standard POMDPs since they naturally model problems involving limited resource or multiple objectives. In this paper, ...
متن کاملPerseus: Randomized Point-based Value Iteration for POMDPs
Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agent’s belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate...
متن کاملGeneralized Point Based Value Iteration for Interactive POMDPs
We develop a point based method for solving finitely nested interactive POMDPs approximately. Analogously to point based value iteration (PBVI) in POMDPs, we maintain a set of belief points and form value functions composed of those value vectors that are optimal at these points. However, as we focus on multiagent settings, the beliefs are nested and computation of the value vectors relies on p...
متن کاملPoint-based value iteration: An anytime algorithm for POMDPs
This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning. PBVI approximates an exact value iteration solution by selecting a small set of representative belief points, and planning for those only. By using stochastic trajectories to choose belief points, and by maintaining only one value hyperplane per point, it is able to successfully solve large problems, incl...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of Artificial Intelligence Research
سال: 2019
ISSN: 1076-9757
DOI: 10.1613/jair.1.11324