Policy Filtering for Planning in Partially Observable Stochastic Domains

Author

  • Nevin Lianwen Zhang
Abstract

Partially observable Markov decision processes (POMDPs) can be used as a model for planning in stochastic domains. This paper considers the problem of computing optimal policies for finite-horizon POMDPs. In deciding on an action to take, an agent is concerned not only with how the action would affect the current time point, but also with its impact on the rest of the planning horizon. In a POMDP, the future effects of an action are not separable from the effects of the agent's future behavior. Consequently, one needs to consider the agent's future behavior in order to properly evaluate the future impacts of an action. One reason POMDPs are difficult to solve is that the agent can behave in a number of ways that is exponential in the length of the remaining planning horizon. This paper represents the agent's future behavior in terms of sub-policy-trees and gives a method that reduces the number of possible sub-policy-trees by collapsing similar sub-policy-trees and by pruning inferior ones.
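As a rough illustration of the idea (not the paper's exact algorithm), the Python sketch below enumerates policy trees for a toy finite-horizon POMDP and filters them at every level: trees with identical value vectors are collapsed, and pointwise-dominated trees are pruned. The model numbers (`T`, `Z`, `R`) are invented, and pointwise domination stands in for the stronger belief-space domination tests used in exact POMDP solvers.

```python
from itertools import product

# A toy two-state POMDP; T[a][s][s'], Z[a][s'][o], and R[a][s] are made-up numbers.
states = [0, 1]
actions = ["a0", "a1"]
observations = ["o0", "o1"]
T = {"a0": [[0.9, 0.1], [0.2, 0.8]], "a1": [[0.5, 0.5], [0.5, 0.5]]}
Z = {"a0": [[0.8, 0.2], [0.3, 0.7]], "a1": [[0.5, 0.5], [0.5, 0.5]]}
R = {"a0": [1.0, 0.0], "a1": [0.0, 1.5]}

def value_vector(action, subtrees):
    """Value of the policy tree rooted at `action`, one entry per (true) state."""
    vec = []
    for s in states:
        v = R[action][s]
        for s2 in states:
            for o_idx, o in enumerate(observations):
                # expected future value: transition, then observe, then follow the subtree
                v += T[action][s][s2] * Z[action][s2][o_idx] * subtrees[o][1][s2]
        vec.append(v)
    return vec

def filter_trees(trees):
    """Collapse trees with identical value vectors, then drop pointwise-dominated ones."""
    unique = {}
    for tree, vec in trees:
        unique.setdefault(tuple(vec), tree)      # "collapse": keep one representative
    kept = []
    for vec, tree in unique.items():
        dominated = any(u != vec and all(w >= x for x, w in zip(vec, u)) for u in unique)
        if not dominated:                        # "prune": drop inferior trees
            kept.append((tree, list(vec)))
    return kept

def policy_trees(horizon):
    """Enumerate depth-`horizon` policy trees, filtering at every level."""
    if horizon == 0:
        return [(None, [0.0] * len(states))]
    prev = policy_trees(horizon - 1)
    trees = []
    for a in actions:
        # one surviving subtree per observation; without filtering this blows up exponentially
        for choice in product(prev, repeat=len(observations)):
            subtrees = dict(zip(observations, choice))
            tree = (a, {o: t for o, (t, _) in subtrees.items()})
            trees.append((tree, value_vector(a, subtrees)))
    return filter_trees(trees)

print(len(policy_trees(3)))  # number of policy trees that survive filtering
```

Even in this toy example, the full enumeration at depth 3 would contain many more trees than survive the filtering step, which is the effect the paper's policy-filtering method aims for on realistic problems.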

Related papers

Automated Hierarchy Discovery for Planning in Partially Observable Domains

Planning in partially observable domains is a notoriously difficult problem. However, in many real-world scenarios, planning can be simplified by decomposing the task into a hierarchy of smaller planning problems which can then be solved independently of one another. Several approaches, mainly dealing with fully observable domains, have been proposed to optimize a plan that decomposes accordin...


Contingent Planning Under Uncertainty via Stochastic Satisfiability

We describe a new planning technique that efficiently solves probabilistic propositional contingent planning problems by converting them into instances of stochastic satisfiability (SSat) and solving these problems instead. We make fundamental contributions in two areas: the solution of SSat problems and the solution of stochastic planning problems. This is the first work extending the planning...


Position Statement: Contingent Planning in Partially Observable Stochastic Domains via Stochastic Satisfiability

Our research has successfully extended the planning-as-satisfiability paradigm to support contingent planning under uncertainty (uncertain initial conditions, probabilistic effects of actions, uncertain state estimation). Stochastic satisfiability (SSAT), a type of Boolean satisfiability problem in which some of the variables have probabilities attached to them, forms the basis of this extension. ...


Planning in Stochastic Domains: Problem Characteristics and Approximation

This paper is about planning in stochastic domains by means of partially observable Markov decision processes (POMDPs). POMDPs are difficult to solve. This paper considers problems where one, although not knowing the true state of the world, has a pretty good idea about it, and uses such problem characteristics to transform POMDPs into approximately equivalent ones that are much easier to solve....


Monitoring plan execution in partially observable stochastic worlds

This thesis presents two novel algorithms for monitoring plan execution in stochastic partially observable environments. The problems can be naturally formulated as partially-observable Markov decision processes (POMDPs). Exact solutions of POMDP problems are difficult to find due to the computational complexity, so many approximate solutions are proposed instead. These POMDP solvers tend to ge...



Journal:

Volume   Issue

Pages  -

Publication year: 1995