Using core beliefs for point-based value iteration
Abstract
Recent research on point-based approximation algorithms for POMDPs has demonstrated that good solutions to POMDP problems can be obtained without considering the entire belief simplex. For instance, the Point Based Value Iteration (PBVI) algorithm [Pineau et al., 2003] computes the value function only for a small set of belief states and iteratively adds more points to the set as needed. A key component of the algorithm is the strategy for selecting belief points so that the space of reachable beliefs is well covered. This paper presents a new method for selecting an initial set of representative belief points, which relies on first finding a basis for the reachable belief simplex. Our approach has better worst-case performance than the original PBVI heuristic and performs well on several standard POMDP tasks.
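To make concrete the idea of computing the value function only at a small set of belief states, the following is a minimal sketch of a generic point-based backup in plain Python. It is not the paper's belief-selection method; it only illustrates the standard backup that PBVI-style algorithms apply at each belief point. The function names and the nested-list model layout (`T[a][s][s2]` for transitions, `O[a][s2][o]` for observations, `R[a][s]` for rewards) are assumptions chosen for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def belief_update(b, a, o, T, O):
    """Bayes filter: b'(s2) is proportional to O[a][s2][o] * sum_s T[a][s][s2] * b[s].

    Returns None when observation o has zero probability under (b, a).
    """
    n = len(b)
    unnorm = [
        O[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
        for s2 in range(n)
    ]
    z = sum(unnorm)
    return [x / z for x in unnorm] if z > 0 else None


def point_backup(b, alphas, T, O, R, gamma):
    """One Bellman backup at belief b; returns the maximizing alpha vector."""
    n_s, n_a, n_o = len(b), len(T), len(O[0][0])
    best_vec, best_val = None, float("-inf")
    for a in range(n_a):
        vec = [R[a][s] for s in range(n_s)]
        for o in range(n_o):
            # Project every current alpha vector back through (a, o).
            candidates = [
                [
                    sum(T[a][s][s2] * O[a][s2][o] * alpha[s2] for s2 in range(n_s))
                    for s in range(n_s)
                ]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief point.
            g = max(candidates, key=lambda c: dot(c, b))
            vec = [v + gamma * x for v, x in zip(vec, g)]
        val = dot(vec, b)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec


def pbvi_sweep(beliefs, alphas, T, O, R, gamma):
    """Back up every belief in the set once, yielding one alpha vector per point."""
    return [point_backup(b, alphas, T, O, R, gamma) for b in beliefs]
```

Starting from a single zero vector, `alphas = [[0.0] * n_states]`, repeated calls to `pbvi_sweep` improve the value estimate at the chosen points only, so the quality of the result depends directly on how well those points cover the reachable beliefs. `belief_update` is included because reachable beliefs are exactly those produced by applying it repeatedly from the initial belief.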
Similar resources
Approximate Solutions of Interactive POMDPs Using Point Based Value Iteration
We develop a point based method for solving finitely nested interactive POMDPs approximately. Analogously to point based value iteration (PBVI) in POMDPs, we maintain a set of belief points and form value functions composed of only those value vectors that are optimal at these points. However, as we focus on multiagent settings, the beliefs are nested and the computation of the value vectors re...
Generalized Point Based Value Iteration for Interactive POMDPs
We develop a point based method for solving finitely nested interactive POMDPs approximately. Analogously to point based value iteration (PBVI) in POMDPs, we maintain a set of belief points and form value functions composed of those value vectors that are optimal at these points. However, as we focus on multiagent settings, the beliefs are nested and computation of the value vectors relies on p...
Belief Selection in Point-Based Planning Algorithms for POMDPs
Current point-based planning algorithms for solving partially observable Markov decision processes (POMDPs) have demonstrated that a good approximation of the value function can be derived by interpolation from the values of a specially selected set of points. The performance of these algorithms can be improved by eliminating unnecessary backups or concentrating on more important points in the ...
Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs
We present four major results towards solving decentralized partially observable Markov decision problems (Dec-POMDPs) culminating in an algorithm that outperforms all existing algorithms on all but one standard infinite-horizon benchmark problems. (1) We give an integer program that solves collaborative Bayesian games (CBGs). The program is notable because its linear relaxation is very often in...
Optimal Control of Hand, Foot and Mouth Disease Model using Variational Iteration Method
In this paper, the optimal control of transmission dynamics of hand, foot and mouth disease (HFMD), formulated by a compartmental deterministic SEIPR (Susceptible-Incubation (Exposed)- Infected - Post infection virus shedding - Recovered) model with vaccination and treatment as control parameters is considered. The objective function is based on the combination of minimizing the number of infec...