Generalized Point Based Value Iteration for Interactive POMDPs

نویسندگان

  • Prashant Doshi
  • Dennis Perez
چکیده

We develop a point based method for solving finitely nested interactive POMDPs approximately. Analogously to point based value iteration (PBVI) in POMDPs, we maintain a set of belief points and form value functions composed of those value vectors that are optimal at these points. However, as we focus on multiagent settings, the beliefs are nested and computation of the value vectors relies on predicted actions of others. Consequently, we develop a novel interactive generalization of PBVI applicable to multiagent settings.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Approximate Solutions of Interactive POMDPs Using Point Based Value Iteration

We develop a point based method for solving finitely nested interactive POMDPs approximately. Analogously to point based value iteration (PBVI) in POMDPs, we maintain a set of belief points and form value functions composed of only those value vectors that are optimal at these points. However, as we focus on multiagent settings, the beliefs are nested and the computation of the value vectors re...

متن کامل

Anytime Point Based Approximations for Interactive POMDPs

Partially observable Markov decision processes (POMDPs) have been largely accepted as a rich-framework for planning and control problems. In settings where multiple agents interact POMDPs prove to be inadequate. The interactive partially observable Markov decision process (I-POMDP) is a new paradigm that extends POMDPs to multiagent settings. The added complexity of this model due to the modeli...

متن کامل

Generalized and bounded policy iteration for finitely-nested interactive POMDPs: scaling up

Policy iteration algorithms for partially observable Markov decision processes (POMDP) offer the benefits of quick convergence and the ability to operate directly on the solution, which usually takes the form of a finite state controller. However, the controller tends to grow quickly in size across iterations due to which its evaluation and improvement become costly. Bounded policy iteration pr...

متن کامل

Generalized and Bounded Policy Iteration for Interactive POMDPs

Policy iteration algorithms for solving partially observable Markov decision processes (POMDP) offer the benefits of quicker convergence and the ability to operate directly on the policy, which usually takes the form of a finite state controller. However, the controller tends to grow quickly in size across iterations due to which its evaluation and improvement become costly. Bounded policy iter...

متن کامل

Symbolic Dynamic Programming for Continuous State and Observation POMDPs

Point-based value iteration (PBVI) methods have proven extremely effective for finding (approximately) optimal dynamic programming solutions to partiallyobservable Markov decision processes (POMDPs) when a set of initial belief states is known. However, no PBVI work has provided exact point-based backups for both continuous state and observation spaces, which we tackle in this paper. Our key in...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008