Robust Planning in Domains with Stochastic Outcomes, Adversaries, and Partial Observability
Authors
Abstract
Real-world planning problems often feature multiple sources of uncertainty, including randomness in outcomes, the presence of adversarial agents, and lack of complete knowledge of the world state. This thesis describes algorithms for four related formal models that can address multiple types of uncertainty: Markov decision processes, MDPs with adversarial costs, extensive-form games, and a new class of games that includes both extensive-form games and MDPs as special cases. Markov decision processes can represent problems where actions have stochastic outcomes. We describe several new algorithms for MDPs, and then show how MDPs can be generalized to model the presence of an adversary that has some control over costs. Extensive-form games can model games with random events and partial observability. In the zero-sum perfect-recall case, a minimax solution can be found in time polynomial in the size of the game tree. However, the game tree must “remember” all past actions and random outcomes, and so the size of the game tree grows exponentially in the length of the game. This thesis introduces a new generalization of extensive-form games that relaxes this need to remember all past actions exactly, producing exponentially smaller representations for interesting problems. Further, this formulation unifies extensive-form games with MDP planning. We present a new class of fast anytime algorithms for the off-line computation of minimax equilibria in both traditional and generalized extensive-form games. Experimental results demonstrate their effectiveness on an adversarial MDP problem and on a large abstracted poker game. We also present a new algorithm for playing repeated extensive-form games that can be used when only the total payoff of the game is observed on each round.
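To make the MDP model mentioned in the abstract concrete, the following is a minimal sketch of standard value iteration on a cost-minimizing MDP with stochastic outcomes. It is a textbook method, not one of the new algorithms the thesis describes. The toy states, action names, probabilities, and costs are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy MDP: states 0 and 1, plus an absorbing goal state 2.
# transitions[state][action] is a list of (probability, next_state, cost).
transitions = {
    0: {
        "safe": [(1.0, 1, 2.0)],                   # deterministic but costly
        "risky": [(0.5, 2, 1.0), (0.5, 0, 1.0)],   # may loop back to state 0
    },
    1: {
        "go": [(1.0, 2, 1.0)],
    },
    2: {},  # goal state: no actions, zero cost-to-go
}


def expected_cost(outcomes, values):
    """Expected immediate cost plus cost-to-go for one action."""
    return sum(p * (cost + values[nxt]) for p, nxt, cost in outcomes)


def value_iteration(transitions, tol=1e-9, max_iters=10_000):
    """Standard (textbook) value iteration for expected total cost."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_iters):
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # absorbing state keeps value 0
                continue
            best = min(expected_cost(o, values) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    return values


def greedy_policy(transitions, values):
    """Pick, in each state, the action with the lowest expected cost."""
    return {
        s: min(actions, key=lambda a: expected_cost(actions[a], values))
        for s, actions in transitions.items()
        if actions
    }


values = value_iteration(transitions)
policy = greedy_policy(transitions, values)
print(values, policy)
```

In this toy instance the "risky" action is preferred in state 0: its expected cost-to-go solves V = 1 + 0.5 V, giving 2, whereas "safe" costs 3 in total. The adversarial-cost and extensive-form settings described in the abstract go beyond what this sketch covers.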
Related Resources
Planning with Extended Goals and Partial Observability
Planning in nondeterministic domains with temporally extended goals under partial observability is one of the most challenging problems in planning. Simpler subsets of this problem have been already addressed in the literature, but the general combination of extended goals and partial observability is, to the best of our knowledge, still an open problem. In this paper we present a first attempt...
Hierarchical Task Planning under Uncertainty
In this paper we present an algorithm for planning in nondeterministic domains. Our algorithm C-SHOP extends the successful classical HTN planner SHOP, by introducing new mechanisms to handle situations where there is incomplete and uncertain information about the state of the environment. Being an HTN planner, C-SHOP supports coding domain-dependent knowledge in a powerful way that describes h...
Planning with Nondeterministic Actions and Sensing
Many planning problems involve nondeterministic actions: actions whose effects are not completely determined by the state of the world before the action is executed. In this paper we consider the computational complexity of planning in domains where such actions are available. We give a formal model of nondeterministic actions and sensing, together with an action language for specifying planning...
Planning in Nondeterministic Domains under Partial Observability via Symbolic Model Checking
Planning under partial observability is one of the most significant and challenging planning problems. It has been shown to be hard, both theoretically and experimentally. In this paper, we present a novel approach to the problem of planning under partial observability in non-deterministic domains. We propose an algorithm that searches through a (possibly cyclic) and-or graph induced by the dom...
A Framework for Planning with Extended Goals under Partial Observability
Planning in nondeterministic domains with temporally extended goals under partial observability is one of the most challenging problems in planning. Subsets of this problem have already been addressed in the literature. For instance, planning for extended goals has been developed under the simplifying hypothesis of full observability, and the problem of partial observability has been tackled ...