Dynamic Abstraction Planning
plan(isd)                          ; isd is initial state description
  let N = ∅;                       ; The graph
  openlist = ∅;
  is = make-initial-state(isd);
  N := N ∪ {is};
  push(is, openlist);
  loop
    if there are no more reachable states in the openlist
      then we are done: break;
      else
        let s = choose a reachable state from openlist;
        openlist := openlist − {s};
        oneof
          split-state:   choose a proposition p and split s into |val(p)| states;
                         remove s from N and insert the new states;
                         add the new states to the open list;
          assign-action: choose an action (or no-op) that is applicable for s;
          fail
Figure 5: The DAP planning algorithm.

to split. We may not be able to split the state productively even if the state is only partially specified. No further splitting will be productive if we can determine that some bad transition must occur in the state, that the state is reachable, and that there are no available actions with which to preempt the bad transition.

The structure of the NFA being constructed guides us in backtracking. When we fail to successfully handle a state, we backjump to the earliest solved state (we keep these on a closed list) that has an edge into the failed state. Because the state is reachable, there must be a state with an edge into it, unless the state is the starting state. If we fail on the starting state, the search as a whole has failed.

Note that we do not backtrack over state refinements. Backtracking over these refinements is never necessary: for every plan that can be found at a low level of detail, there is a corresponding plan at every higher level. Our experience suggests that the cost of "coarsening" an NFA (and the additional bookkeeping necessary to provide this option) is not worth the small savings in graph size.

Through additional backtracking, we provide a simple anytime behavior. The AIS caches plans as they are produced (recall that all plans are safety-preserving). Through backtracking, the AIS can generate plans that satisfy more of the goal propositions.
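The anytime behavior described above can be pictured as a loop that caches each safe plan as it is produced and keeps the one satisfying the most goal propositions. The following is only a minimal Python sketch under stated assumptions: the `Plan` record, the `best_plan_within` function, and the step `budget` (standing in for the time the AIS chooses to invest) are all hypothetical names introduced for illustration and are not part of CIRCA.

```python
from dataclasses import dataclass
from itertools import islice
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Plan:
    """Hypothetical stand-in for a cached, safety-preserving plan."""
    name: str
    goals_satisfied: FrozenSet[str]


def best_plan_within(plans: Iterable[Plan], budget: int) -> Optional[Plan]:
    """Return the examined plan that satisfies the most goal propositions.

    `plans` is assumed to yield plans in the order backtracking finds them;
    `budget` caps how many plans are examined before planning stops.
    """
    best: Optional[Plan] = None
    for plan in islice(plans, budget):
        if best is None or len(plan.goals_satisfied) > len(best.goals_satisfied):
            best = plan  # keep the improved plan; every candidate is assumed safe
    return best
```

With a budget of 1 the caller simply receives the first safe plan; larger budgets can only return a plan that satisfies at least as many goal propositions.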
Thus once a first safety-producing plan is generated, the AIS may at will invest more time into generating better plans.

There are two aspects to the heuristic control of the search: the search should be directed to achieve safety and to move the system towards states that satisfy as many goal propositions as possible. To make the search for goal propositions most efficient, the first action the DAP planner takes is to split the initial state according to the goal propositions.

The heuristic we use for directing the choice of actions and refinements is a modified version of McDermott's heuristic estimator for state-based ADL planning (McDermott 1996). When choosing how to handle a state, the planner constructs an operator-proposition graph connecting the current state description to the goal state description. This is a layered graph, with alternating layers containing nodes that represent propositions to be achieved and operators that can establish those propositions. Despite using full lookahead, this approach is heuristic and efficient because it ignores details such as interactions between operators.

[Figure 6: Using refinement to isolate a failure.]

Our version differs from McDermott's because our actions are simple STRIPS operators; his approach covers schemas as well and must consider variable binding. Another difference is that McDermott's is a more traditional state-space planner, so state descriptions are complete and the only way to establish a proposition is to apply an operator with the appropriate postconditions. Our state descriptions are partial, and one way for the DAP planner to establish a proposition is to refine a partial state description to include that proposition.
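Refining a partial state description, as in the split-state step of the Figure 5 algorithm, can be sketched in Python as follows. This is an illustrative sketch only: representing an abstract state as a frozenset of (proposition, value) pairs, and the names `split_state` and `apply_split`, are assumptions made here and are not taken from the DAP implementation.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Assumed representation: an abstract state is a partial assignment, stored
# as a frozenset of (proposition, value) pairs. Propositions that do not
# appear in the set are left unspecified.
State = FrozenSet[Tuple[str, str]]


def split_state(s: State, p: str, values: Iterable[str]) -> List[State]:
    """Split s on proposition p, giving one refined state per value of p."""
    if any(name == p for name, _ in s):
        raise ValueError(f"state already specifies {p!r}")
    return [s | {(p, v)} for v in values]


def apply_split(nodes: Set[State], openlist: List[State],
                s: State, p: str, values: Iterable[str]) -> List[State]:
    """Remove s from the graph, insert its refinements, and reopen them."""
    new_states = split_state(s, p, values)
    nodes.discard(s)             # remove s from N
    nodes.update(new_states)     # insert the new states
    openlist.extend(new_states)  # add the new states to the open list
    return new_states
```

Each refined state keeps everything the parent specified and adds exactly one value for the chosen proposition, so the parent's |val(p)| refinements together cover the same set of concrete states.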
Note that this operation is similar to the kind of conditional planning done by CNLP (Peot & Smith 1992) and Plinth (Goldman & Boddy 1994): when the planner cannot determine a priori the value of a proposition, it plans for both alternatives.

The planner combines information about the context of a state with the heuristic information provided by the operator-proposition graph. For example, when choosing between several interesting propositions on which to refine a state, the planner will prefer those that are established by some transition leading into the state.

As we mentioned earlier, the planner must concern itself with safety as well as goal achievement. One place where this difference becomes significant is when backtracking from a bad state (a state is bad if it has an unpreemptable path to the failure state). In this case, the planner will work to avoid the failure. There are two ways to do this: either avoid actions that lead to the bad state or refine the bad abstract state, to demonstrate that the sub-states in which the bad transition(s) occur are not, in fact, reachable (for example, see Figure 6). Safety concerns also intrude when none of the goal-directed actions available at a state are fast enough to preempt a transition that would lead to failure. Safety is always the paramount consideration, causing the planner to choose an action not preferred by the heuristics in this case.

Implementation Status & Preliminary Results

The prototype DAP planner is implemented and running on a selection of example domains that were used in the original CIRCA research. The DAP planner reasons about safety-preserving goals of avoidance and optional goals of achievement in much the same way as the original CIRCA planner, except that it does not yet consider the detailed temporal model necessary to ensure failure preemption in all cases.
Manual inspection of the prototype's output plans shows that they are very similar to the original planner's; the new planner chooses the same actions for the same states, but does not yet correctly derive the timing requirements on all of those actions.

Given these limitations, comparisons between the two planners are still only approximate. However, initial results are dramatic. Figure 7 shows several representative cases, some with nearly an order of magnitude reduction in search space using DAP. In the Puma 1 domain, which is one of the largest problems to which CIRCA has been applied, the DAP planner is able to find significant structure in the domain that the original CIRCA planner cannot exploit. For example, the DAP plan is able to describe all of the conditions in which to take the push-emergency-button action as a disjunction of just three abstract state descriptions, while the original CIRCA planner selects that action for 54 different fully-described states.

              Enumerated States       Runtime (sec)
Domain        Original    DAP         Original    DAP
Name          Planner     Planner     Planner     Planner
Puma 1        826         89          222         1.13
Xdemo 2       28          9           0.43        0.08
Puma 3        76          16          8.59        0.09
Puma 4        330         71          68.3        0.59
BT 6          7           7           0.08        0.04
Puma 9        212         41          58.8        0.33
Figure 7: The DAP planner dramatically reduces the search space and time.

The BT 6 domain is a small, hand-crafted problem designed to force the original CIRCA planner to backtrack through several decisions, thus exercising the backtracking and worst-case state space enumeration of the planner.
The domain has only one state feature, so the DAP planner can find no suitable abstraction and it makes the same backtracking moves as the original planner, yielding the same search-space performance. To date, this is the only domain in which the DAP technique has not yielded any performance improvement. Other simple domains, such as the Xdemo 2 domain (which has only 5 state features), still contain enough hidden structure that the DAP technique is able to find and exploit feasible abstractions.

Related Work

Many classical planning systems have used abstraction methods to increase the efficiency of searching for plans (see (Kambhampati 1994) for a brief survey). However, these abstractions are typically used only as guides in searching for a plan; the system may not know that its goals will actually be achieved by an abstract plan, and it will not be able to execute the abstracted operators directly. Instead, traditional abstraction planners must eventually expand their current plans down to the lowest level of detail, removing the abstraction to produce a final executable plan.

In the DAP approach, which involves abstraction only of state descriptions, abstract plans are executable, because the operators are always completely specified. This has two main advantages. First, the planning process can supply initial plans that preserve safety but might, on further refinement, do a better job of goal achievement. Second, the planning process can terminate with an executable abstract plan, which our results have shown may be much smaller than the corresponding plan expanded to precisely-defined states.

Dearden and Boutilier (1997) have developed an abstract planning algorithm for decision-theoretic planning modeled as a Markov decision process (MDP). Their method is similar to the DAP approach in that it involves aggregating states, but there are some differences.
First, their method is not dynamic: aggregation is performed using a predefined set of "relevant" propositions, which is determined using Knoblock's approach (Knoblock 1994). Second, their method is uniform: the same propositions are relevant everywhere. The underlying model is also significantly different from CIRCA's: it does not model exogenous events or the timing required for real-time guarantees.

Kabanza et al. (Kabanza, Barbeau, & St-Denis 1997) have developed a planning method for reactive agents that is similar to the original CIRCA. Their architecture differs in emphasis, however. The NFAs it constructs are "clocked:" they make transitions at times that are the least common denominator of all possible transitions. This scheme will suffer a state space explosion in domains where there is a wide range of possible transition delays, like those to which CIRCA has been applied. Kabanza's group has concentrated on developing a more flexible notation for goals than those used by CIRCA, but they do not make the same distinction between safety and goal achievement. In previous work, Godefroid and Kabanza (Godefroid & Kabanza 1991) developed an abstraction technique based on partial orders. Their results allow a system to examine only a single ordering of independent actions, rather than enumerating all possible orderings. Unfortunately, these results are not immediately applicable to CIRCA, because their world model does not include exogenous events. The more recent work by Kabanza et al. (Kabanza, Barbeau, & St-Denis 1997) does include exogenous events, but they do not seem to have carried over the earlier abstraction concepts.

Future Directions

In this paper, we have presented Dynamic Abstraction Planning (DAP), an abstraction technique that we use to generate real-time control plans in the CIRCA system. This abstraction technique is significantly different from others in preserving safety guarantees and in performing abstraction locally and dynamically.
In our experience, by automatically selecting the appropriate level of abstraction at each step during the planning process, DAP significantly reduces the size of the search space.

The main next step in developing the DAP methodology is to fully integrate the detailed temporal reasoning that the current prototype omits. This will bring the new planner onto equal footing with the original CIRCA planner, and will allow more accurate comparisons of the efficiency improvements gained by using the dynamic abstraction method.

Acknowledgments

This work was supported by the Defense Advanced Research Projects Agency under contract DAAK60-94-C-0040-P0006. We thank the reviewers for their helpful comments.

References

Dearden, R., and Boutilier, C. 1997. Abstraction and approximate decision-theoretic planning. Artificial Intelligence 89(1–2):219–283.
Godefroid, P., and Kabanza, F. 1991. An efficient reactive planner for synthesizing reactive plans. In Proc. Nat'l Conf. on Artificial Intelligence, 640–645.
Goldman, R. P., and Boddy, M. S. 1994. Conditional linear planning. In Proc. Second Int'l Conf. on Artificial Intelligence Planning Systems, 80–85.
Kabanza, F.; Barbeau, M.; and St-Denis, R. 1997. Planning control rules for reactive agents. Technical Report 197, Comp. Sci. Dept., Univ. of Sherbrooke.
Kambhampati, S. 1994. Refinement search as a unifying framework for analyzing planning algorithms. In Proc. Fourth Int'l Conf. on Principles of Knowledge Representation and Reasoning.
Knoblock, C. A. 1994. Automatically generating abstractions for planning. Artificial Intelligence 68:243–302.
McDermott, D. 1996. A heuristic estimator for means-ends analysis in planning. In Proc. Third Int'l Conf. on Artificial Intelligence Planning Systems, 142–149.
Musliner, D. J.; Durfee, E. H.; and Shin, K. G. 1993. CIRCA: a cooperative intelligent real-time control architecture. IEEE Trans. Systems, Man, and Cybernetics 23(6):1561–1574.
Musliner, D. J.; Durfee, E. H.; and Shin, K. G. 1995. World modeling for the dynamic construction of real-time control plans. Artificial Intelligence 74(1):83–127.
Peot, M. A., and Smith, D. E. 1992. Conditional nonlinear planning. In Proc. First Int'l Conf. on Artificial Intelligence Planning Systems, 189–197.
Sacerdoti, E. D. 1974. Planning in a hierarchy of abstraction spaces. Artificial Intelligence 5(2):115–135.