Improving Real-Time Heuristic Search on Initially Unknown Maps
Abstract
Real-time search methods allow an agent to perform path-finding tasks in unknown environments. Some of these methods may plan several actions per planning step. We present a novel approach in which the number of actions planned per step depends on the quality of the heuristic values found in the lookahead. If, after inspecting the neighborhood of a state, its heuristic value changes, only one action is planned. When the heuristic values of all states in the lookahead do not change, several actions are planned. We provide experimental evidence of the benefits of this approach, with respect to other real-time algorithms, on existing benchmarks.

Introduction

Consider an agent that has to perform a path-finding task from a start position to a goal position in an environment that is initially unknown. The agent can only sense the surrounding area that is in range of its sensors. In addition, it remembers those areas that it has visited previously. An example of this task appears in Figure 1. This situation arises for characters in real-time computer games (Bulitko & Lee 2006) and in robot control (Koenig 2001). Off-line search methods, like A* (Hart, Nilsson, & Raphael 1968), are not appropriate for these tasks, because they require knowing the terrain in advance. Incremental versions of A*, like D* Lite (Koenig & Likhachev 2002), and real-time search methods (Korf 1990) allow an agent to solve this task, but of these two, only real-time search methods can perform the planning phase in a limited, short amount of time (a comparison between incremental versions of A* and real-time heuristic search appears in (Koenig 2004)).

Real-time search interleaves planning and action execution phases in an on-line manner. In the planning phase the agent plans one or several actions, which are performed in the action execution phase. Real-time methods restrict the search to a small part of the state space around the current state, called the local space. The size of the local space is small and independent of the size of the complete state space, so that searching in the local space is feasible in the limited time of the planning phase. As a result, the agent determines how to move inside the local space and plans one or several actions, which are performed in the next action execution phase. The whole process iterates with new planning and action execution phases until a goal state is reached. At each step, this strategy computes the beginning of the trajectory from the current state to a goal state. Since search is limited to a small portion of the state space, there is no guarantee of producing an optimal global trajectory. However, some methods guarantee that after repeated executions on the same problem instance, the trajectory converges to an optimal path. To prevent cycling, real-time methods update the heuristic values associated with visited states.

Initially, real-time search algorithms planned one action per planning step (Korf 1990). Nowadays, several actions can be planned at each planning step and are performed sequentially afterwards (Koenig 2004; Koenig & Likhachev 2006; Bulitko & Lee 2006). There is some debate about the relative performance of planning a single action versus planning several actions per planning step, with the same amount of lookahead.
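For concreteness, the interleaved planning and action-execution loop described above can be sketched as follows. This is a minimal LRTA*-style sketch with a lookahead of one step; the function names and the successors(), cost(), and heuristic() helpers are illustrative assumptions, not taken from the paper.

# Minimal LRTA*-style sketch of the planning / action-execution loop described
# above (lookahead of one step). The helpers successors(), cost() and
# heuristic() are illustrative assumptions, not part of the paper.

def plan_one_action(state, h, successors, cost, heuristic):
    """Planning phase: inspect the neighborhood, update h, pick one action."""
    best_succ, best_f = None, float("inf")
    for s in successors(state):
        f = cost(state, s) + h.get(s, heuristic(s))
        if f < best_f:
            best_succ, best_f = s, f
    # Learning step: raising h(state) is what prevents the agent from cycling.
    h[state] = max(h.get(state, heuristic(state)), best_f)
    return best_succ


def real_time_search(start, goal_test, successors, cost, heuristic):
    h = {}                        # heuristic values learned for visited states
    state, trajectory = start, [start]
    while not goal_test(state):
        state = plan_one_action(state, h, successors, cost, heuristic)  # planning phase
        trajectory.append(state)                                        # action execution phase
    return trajectory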
If a single action is planned, the resulting action will probably be of higher quality than if several actions were planned, which may decrease the cost of the final solution. However, planning one action per step often increases the overall CPU time devoted to planning. In this paper we present an alternative approach: to take into account the quality of the heuristic values found during the lookahead. If we find any evidence that the heuristic is not accurate, we suggest taking a conservative approach and planning only one action.
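One possible reading of this rule, under the same illustrative assumptions as the previous sketch, is shown below. Here lookahead_path is assumed to be the sequence of states leading to the most promising frontier state found by the lookahead; this argument and the exact update are assumptions for illustration, not the paper's actual procedure.

# Sketch of the adaptive rule described above, under the same illustrative
# assumptions as the previous snippet. `lookahead_path` is an assumed data
# structure (states on the path to the best frontier state), not the paper's.

def actions_to_commit(state, lookahead_path, h, successors, cost, heuristic):
    """Decide how many of the planned actions to execute in this step."""
    h_changed = False
    for s in [state] + lookahead_path[:-1]:   # states whose neighborhoods were inspected
        old = h.get(s, heuristic(s))
        new = min(cost(s, t) + h.get(t, heuristic(t)) for t in successors(s))
        if new > old:
            h[s] = new
            h_changed = True                  # evidence that the heuristic was inaccurate
    if h_changed:
        return lookahead_path[:1]             # be conservative: commit to a single action
    return lookahead_path                     # heuristic looked reliable: commit to all

With a rule of this kind the agent behaves like the single-action variant whenever learning occurs, and only commits to longer move sequences when the lookahead gives no evidence that the heuristic is misleading.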
Related papers
Learning in Real-Time Search: A Unifying Framework
Real-time search methods are suited for tasks in which the agent is interacting with an initially unknown environment in real time. In such simultaneous planning and learning problems, the agent has to select its actions in a limited amount of time, while sensing only a local part of the environment centered at the agent’s current location. Real-time heuristic search agents select actions using...
Incremental Heuristic Search in Games: The Quest for Speed
Robot path-planning methods can be used in real-time computer games but then need to run fast to ensure that the game characters move in real time, an issue addressed by incremental heuristic search methods. In this paper, we demonstrate how one can speed up D* Lite, an incremental heuristic search method that implements planning with the freespace assumption to move game characters in initiall...
Aggressive Heuristic Depression Filling with LRTA
The capability of learning is one of the most remarkable features of real-time search, which adapts to tasks where the agent is interacting with an initially unknown environment in real time. However, the performance of real-time learning heavily depends on the topography of the estimated problem space, because of the existence of heuristic depressions. Areas of the state space that suffered from he...
Combining Lookahead and Propagation in Real-Time Heuristic Search
Real-time search methods allow an agent to perform path-finding tasks in unknown environments. Some real-time heuristic search methods may plan several elementary moves per planning step, requiring a lookahead greater than inspecting immediate successors. Recently, the propagation of heuristic changes in the same planning step has been shown to be beneficial for improving the performance of these metho...
Algorithm Comparison for Moving Target Search CMPUT 651 Midterm Report
For moving target search algorithms, we are considering the case of heuristic search where the goal may change during the course of the search. One motivating application lies in the navigation of an autonomous police vehicle chasing a villain [5] in a possibly initially unknown environment under real-time constraints. In this scenario, we would like to ensure that the autonomous vehicle can quic...