COMP-627 Project: Belief State Space Compression for Bayes-Adaptive POMDPs

Author

  • Stéphane Ross
Abstract

Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical framework for sequential decision making in partially observable stochastic environments. While the POMDP model is generally assumed to be known, this is rarely the case in practice, as the parameters of the model must be finely tuned to reflect reality as closely as possible. Hence it is of crucial importance to develop new approaches that take the uncertainty in these parameters into account during planning and further refine the POMDP model as experience is acquired in the environment. To this end, we proposed in a previous project the Bayes-Adaptive POMDP model, which allows one to both learn the POMDP model and plan by considering the model uncertainty and the value of learning information about the POMDP model. One problem with this model is that it has an infinite state space, which makes computing a solution very challenging. To alleviate this problem, we propose a semimetric that allows this infinite state space to be approximated by a finite set of states while preserving the value function of the infinite-dimensional belief state space to arbitrary precision. This allows us to define a new finite POMDP that can, in theory, be solved using standard methods, and whose solution will be arbitrarily close to the optimal solution of the Bayes-Adaptive POMDP.
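The core idea of the abstract — approximating an infinite belief space by a finite set of representative states that are close under a semimetric — can be illustrated with a greedy ε-net construction. This is only a sketch under simplifying assumptions: the L1 distance here stands in for the paper's actual semimetric, which is designed so that beliefs close under it have provably close values.

```python
import numpy as np

def l1_distance(b1, b2):
    # Stand-in distance between belief vectors. In the BA-POMDP setting the
    # semimetric is constructed so that it upper-bounds differences in value.
    return float(np.abs(np.asarray(b1) - np.asarray(b2)).sum())

def epsilon_cover(beliefs, eps, dist=l1_distance):
    """Greedily select a finite subset of representatives such that every
    belief in the input lies within eps of some representative (an eps-net).

    The finite POMDP is then defined over these representatives; shrinking
    eps trades off state-space size against approximation error."""
    reps = []
    for b in beliefs:
        if all(dist(b, r) > eps for r in reps):
            reps.append(b)
    return reps
```

For example, covering a sample of beliefs over three states with `epsilon_cover(beliefs, 0.5)` yields a much smaller set of representatives, each sampled belief lying within 0.5 (in L1) of one of them.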


Similar Resources

Bayes-Adaptive Interactive POMDPs

We introduce the Bayes-Adaptive Interactive Partially Observable Markov Decision Process (BA-IPOMDP), the first multiagent decision model that explicitly incorporates model learning. As in I-POMDPs, the BA-IPOMDP agent maintains beliefs over interactive states, which include the physical states as well as the other agents’ models. The BA-IPOMDP assumes that the state transition and observation ...


Exact Dynamic Programming for Decentralized POMDPs with Lossless Policy Compression

High dimensionality of belief space in DEC-POMDPs is one of the major causes that makes the optimal joint policy computation intractable. The belief state for a given agent is a probability distribution over the system states and the policies of other agents. Belief compression is an efficient POMDP approach that speeds up planning algorithms by projecting the belief state space to a low-dimens...


Exponential Family PCA for Belief Compression in POMDPs

Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are intractable for large models. The intractability of these algorithms is due to a great extent to their generating an optimal policy over the entire belief space. However, in real POMDP problems most belief states are unlikely, and there is a structured, low-dimensional manifold...


Compressing POMDPs Using Locality Preserving Non-Negative Matrix Factorization

Partially Observable Markov Decision Processes (POMDPs) are a well-established and rigorous framework for sequential decision-making under uncertainty. POMDPs are well-known to be intractable to solve exactly, and there has been significant work on finding tractable approximation methods. One well-studied approach is to find a compression of the original POMDP by projecting the belief states to...


On the Linear Belief Compression of POMDPs: A re-examination of current methods

Belief compression improves the tractability of large-scale partially observable Markov decision processes (POMDPs) by finding projections from high-dimensional belief space onto low-dimensional approximations, where solving to obtain action selection policies requires fewer computations. This paper develops a unified theoretical framework to analyse three existing linear belief compression app...
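The linear belief compression described above — projecting high-dimensional beliefs onto a low-dimensional subspace — can be sketched with a truncated-SVD projection. This is a generic illustration, not any of the specific compressions analysed in the paper (e.g. value-directed compression), which choose the projection differently.

```python
import numpy as np

# Sample 500 belief vectors over 10 states (points on the probability simplex).
rng = np.random.default_rng(1)
B = rng.dirichlet(np.ones(10), size=500)

# Find a k-dimensional linear subspace capturing most belief variation.
mean = B.mean(axis=0)
U, S, Vt = np.linalg.svd(B - mean, full_matrices=False)
k = 3
W = Vt[:k]                       # rows span the compression subspace

B_low = (B - mean) @ W.T         # compressed, low-dimensional beliefs
B_rec = B_low @ W + mean         # approximate reconstruction in belief space
err = np.abs(B - B_rec).max()    # worst-case reconstruction error
```

Planning then operates in the `k`-dimensional compressed space, with `err` quantifying how much belief information the linear projection discards.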



Journal:

Volume   Issue

Pages  -

Publication date: 2007