Identifying and exploiting weak-information inducing actions in solving POMDPs
Authors
Abstract
We present a method for identifying actions that lead to observations which are only weakly informative in the context of partially observable Markov decision processes (POMDPs). We call such actions weak (including zero-) information inducing. Policy subtrees rooted at these actions may be computed more efficiently. While zero-information inducing actions may be exploited without error, the quicker backup for weak but non-zero information inducing actions may introduce error. We empirically demonstrate the substantial computational savings that exploiting such actions may bring to exact and approximate POMDP solutions while maintaining solution quality.
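The abstract does not reproduce the detection procedure, but the idea can be sketched. Below is a minimal illustration in Python, assuming a tabular observation model O(o | s', a) stored as a NumPy array; the function name, array layout, and the eps threshold are all illustrative, not the paper's API.

import numpy as np

def classify_action_information(obs_model, action, eps=1e-3):
    """Classify an action as zero-, weak-, or strong-information inducing.

    obs_model is a (|A|, |S|, |O|) array with entries O(o | s', a): a
    hypothetical tabular observation model. An action induces zero
    information when O(. | s', a) is the same distribution for every
    successor state s', so the observation cannot discriminate states.
    """
    O_a = obs_model[action]                # rows: P(o | s', a), one per s'
    mean_row = O_a.mean(axis=0)            # state-averaged observation distribution
    # Largest total-variation distance from any state's row to the mean row.
    max_tv = 0.5 * np.abs(O_a - mean_row).sum(axis=1).max()
    if max_tv == 0.0:
        return "zero"    # exploitable without any error
    if max_tv <= eps:
        return "weak"    # quicker backup applies, may introduce small error
    return "strong"

For example, an action whose observation distribution is uniform in every successor state is classified as zero-information inducing, so the policy subtree rooted at it can be collapsed without error; a weak action can be treated the same way at a small cost in solution quality.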
Similar resources
Efficient Planning in Stochastic Domains through Exploiting Problem Characteristics
A partially observable Markov decision process (POMDP) can be used as a model for planning in stochastic domains. However, general POMDPs are computationally expensive to solve. This paper investigates how problem characteristics might be exploited to cut down computation. We consider planning problems where observations are informative of the world state and there is not much uncertainty in e...
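As a rough illustration of this problem characteristic (the helper below is hypothetical, not from the paper), one can test whether observations nearly identify the world state by checking that every state's observation distribution has low entropy:

import numpy as np

def observations_are_informative(obs_model, action, max_entropy=0.1):
    """Heuristic check that observations nearly identify the world state.

    obs_model is a (|A|, |S|, |O|) array with entries O(o | s', a).
    Returns True when, for every successor state s', the distribution
    P(o | s', a) is concentrated (low entropy in nats).
    """
    O_a = obs_model[action]                      # (|S|, |O|) rows
    p = np.clip(O_a, 1e-12, 1.0)                 # avoid log(0)
    entropy = -(p * np.log(p)).sum(axis=1)       # per-state observation entropy
    return bool((entropy <= max_entropy).all())

When such a test passes for the available actions, beliefs stay concentrated on few states, which is the kind of structure a planner of this sort can exploit.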
Approximate solutions for factored Dec-POMDPs with many agents
Dec-POMDPs are a powerful framework for planning in multiagent systems, but are provably intractable to solve. Despite recent work on scaling to more agents by exploiting weak couplings in factored models, scalability for unrestricted subclasses remains limited. This paper proposes a factored forward-sweep policy computation method that tackles the stages of the problem one by one, exploiting w...
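A forward-sweep computation of this kind can be sketched as follows; stage_solver is an assumed callback standing in for the per-stage solver over weakly coupled agent subsets, not the paper's actual interface:

def forward_sweep_policy(horizon, initial_belief, stage_solver):
    """Sketch of stage-by-stage (forward-sweep) policy computation.

    stage_solver(t, belief) is assumed to return a joint decision rule
    for stage t together with the belief propagated to stage t + 1.
    """
    policy, belief = [], initial_belief
    for t in range(horizon):
        rule, belief = stage_solver(t, belief)  # exploit weak couplings per stage
        policy.append(rule)
    return policy

Because each stage is solved once and never revisited, cost grows with the horizon rather than with the full joint policy space; the loss in quality the abstract mentions comes from committing to each stage's rule before later stages are considered.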
Properly Acting under Partial Observability with Action Feasibility Constraints
We introduce the Action-Constrained Partially Observable Markov Decision Process (AC-POMDP), which arose from studying critical robotic applications with damaging actions. AC-POMDPs restrict the optimized policy to apply only feasible actions: each action is feasible in a subset of the state space, and the agent can observe the set of applicable actions in the current hidden state, in addition to s...
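As a rough illustration of the feasibility restriction (the names and signature below are hypothetical, not the paper's), a policy executed online could mask its action choice by the observed set of applicable actions:

def best_feasible_action(q_values, feasible_actions):
    """Pick the highest-valued action among those observed to be applicable.

    q_values maps each action to its estimated value at the current
    belief; feasible_actions is the set of applicable actions the agent
    observes in the current hidden state.
    """
    candidates = {a: q for a, q in q_values.items() if a in feasible_actions}
    if not candidates:
        raise ValueError("no observed-feasible action to choose from")
    return max(candidates, key=candidates.get)

For instance, best_feasible_action({'sample': 3.4, 'drill': 5.0}, {'sample'}) returns 'sample' even though drilling has a higher estimated value, because drilling was not observed to be applicable.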