COMP-627 Project: Belief State Space Compression for Bayes-Adaptive POMDPs
Author
Abstract
Partially Observable Markov Decision Processes (POMDPs) provide a principled mathematical framework for sequential decision making in partially observable stochastic environments. While it is generally assumed that the POMDP model is known, this is rarely the case in practice, as the parameters of the model must be finely tuned to reflect reality as closely as possible. Hence it is of crucial importance to develop new approaches that can take the uncertainty of these parameters into account during the planning process and further refine the POMDP model as experience is acquired in the environment. To this end, we proposed in a previous project the Bayes-Adaptive POMDP model, which allows one both to learn the POMDP model and to plan while accounting for model uncertainty and for the value of learning information about the model. One problem with this model is that it has an infinite state space, which makes computing a solution very challenging. To alleviate this problem, we propose a semimetric under which this infinite state space can be approximated by a finite set of states while preserving the value function of the infinite-dimensional belief state space to arbitrary precision. This allows us to define a new finite POMDP that can, in principle, be solved using standard methods, and whose solution is arbitrarily close to the optimal solution of the Bayes-Adaptive POMDP.
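The core idea above, covering an infinite space by finitely many representative points under a (semi)metric, can be illustrated with a minimal sketch. The L1 distance and the greedy cover below are illustrative stand-ins; the paper's actual semimetric and construction are not specified here.

```python
# Hypothetical sketch: greedy epsilon-cover of sampled belief points under
# an L1 semimetric, showing how an infinite belief space can be replaced
# by a finite set of representatives within distance eps of every point.
import numpy as np

def l1_semimetric(b1, b2):
    """L1 distance between two belief vectors (illustrative stand-in
    for the paper's semimetric)."""
    return float(np.abs(b1 - b2).sum())

def epsilon_cover(beliefs, eps):
    """Greedily pick representatives so that every sampled belief lies
    within eps of some representative."""
    reps = []
    for b in beliefs:
        if all(l1_semimetric(b, r) > eps for r in reps):
            reps.append(b)
    return reps

rng = np.random.default_rng(0)
# Sample beliefs uniformly from the 3-state probability simplex.
samples = rng.dirichlet(np.ones(3), size=500)
cover = epsilon_cover(samples, eps=0.3)
print(len(cover))  # a finite set of representative belief states
```

Shrinking `eps` makes the cover larger but the induced finite approximation tighter, mirroring the arbitrary-precision trade-off claimed in the abstract.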
Similar Papers
Bayes-Adaptive Interactive POMDPs
We introduce the Bayes-Adaptive Interactive Partially Observable Markov Decision Process (BA-IPOMDP), the first multiagent decision model that explicitly incorporates model learning. As in I-POMDPs, the BA-IPOMDP agent maintains beliefs over interactive states, which include the physical states as well as the other agents’ models. The BA-IPOMDP assumes that the state transition and observation ...
Exact Dynamic Programming for Decentralized POMDPs with Lossless Policy Compression
High dimensionality of the belief space in DEC-POMDPs is one of the major causes that make optimal joint policy computation intractable. The belief state for a given agent is a probability distribution over the system states and the policies of other agents. Belief compression is an efficient POMDP approach that speeds up planning algorithms by projecting the belief state space to a low-dimens...
Exponential Family PCA for Belief Compression in POMDPs
Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are intractable for large models. The intractability of these algorithms is due to a great extent to their generating an optimal policy over the entire belief space. However, in real POMDP problems most belief states are unlikely, and there is a structured, low-dimensional manifold...
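The manifold idea in this snippet can be sketched with plain linear PCA as a simplified stand-in for the paper's Exponential Family PCA; all names and dimensions below are illustrative assumptions.

```python
# Sketch: compressing sampled POMDP beliefs with plain PCA (a linear
# stand-in for Exponential Family PCA, which uses a different loss).
import numpy as np

def pca_compress(beliefs, k):
    """Project belief vectors onto their top-k principal components
    and reconstruct them from the low-dimensional codes."""
    mean = beliefs.mean(axis=0)
    centered = beliefs - mean
    # SVD yields the principal directions of the sampled belief set.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]
    codes = centered @ components.T      # low-dimensional codes
    recon = codes @ components + mean    # reconstructed beliefs
    return codes, recon

rng = np.random.default_rng(1)
# 200 sampled beliefs over a 10-state problem.
beliefs = rng.dirichlet(np.ones(10), size=200)
codes, recon = pca_compress(beliefs, k=3)
print(codes.shape)  # (200, 3)
```

If the reachable beliefs really do lie near a low-dimensional manifold, planning over the 3-dimensional codes is far cheaper than over the full 10-dimensional simplex.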
Compressing POMDPs Using Locality Preserving Non-Negative Matrix Factorization
Partially Observable Markov Decision Processes (POMDPs) are a well-established and rigorous framework for sequential decision-making under uncertainty. POMDPs are well-known to be intractable to solve exactly, and there has been significant work on finding tractable approximation methods. One well-studied approach is to find a compression of the original POMDP by projecting the belief states to...
On the Linear Belief Compression of POMDPs: A re-examination of current methods
Belief compression improves the tractability of large-scale partially observable Markov decision processes (POMDPs) by finding projections from high-dimensional belief space onto low-dimensional approximations, where solving to obtain action selection policies requires fewer computations. This paper develops a unified theoretical framework to analyse three existing linear belief compression app...
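A minimal sketch of the linear projection these methods share: when sampled beliefs lie in a low-dimensional subspace, a linear map compresses them losslessly. The SVD-based construction of the compression matrix below is one illustrative choice, not any specific method from the paper.

```python
# Sketch: linear belief compression. Beliefs are built to lie in a
# 2-D subspace of the 6-state simplex, so an SVD-derived projection
# F compresses and recovers them exactly.
import numpy as np

rng = np.random.default_rng(2)
corners = rng.dirichlet(np.ones(6), size=2)   # two basis beliefs
w = rng.dirichlet(np.ones(2), size=100)       # convex mixing weights
beliefs = w @ corners                         # 100 beliefs, rank <= 2

_, _, vt = np.linalg.svd(beliefs, full_matrices=False)
F = vt[:2]                   # 2 x 6 linear compression matrix
compressed = beliefs @ F.T   # low-dimensional representation
recovered = compressed @ F   # back-projection into belief space
err = np.abs(beliefs - recovered).max()
print(err)  # near zero: compression is lossless on this subspace
```

In realistic problems beliefs only lie near such a subspace, so the residual `err` is nonzero and bounds how much the compressed policy's value can degrade.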