Choosing Search Heuristics by Non-Stationary Reinforcement Learning
Abstract
Search decisions are often made using heuristic methods because real-world applications can rarely be tackled without any heuristics. In many cases, several heuristics are available, and it is not clear a priori which would perform best. In this article, we propose a procedure that learns, during the search process, how to select promising heuristics. The learning is based on weight adaptation and can even switch between different heuristics during search. Different variants of the approach are evaluated within a constraint-programming environment.
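The abstract does not state the actual update rule, so the following is only a rough sketch of what weight-based heuristic selection under non-stationarity could look like, not the authors' procedure. It assumes one weight per heuristic, an epsilon-greedy choice, and a constant step size so that recent rewards outweigh old ones. The class name HeuristicSelector, the heuristic names h_a and h_b, the 0/1 reward, and all parameter values are hypothetical.

```python
import random


class HeuristicSelector:
    """Hypothetical selector: one adaptive weight per heuristic."""

    def __init__(self, names, step_size=0.2, epsilon=0.2, seed=0):
        self.weights = {name: 0.0 for name in names}
        self.step_size = step_size  # constant, so old rewards are forgotten
        self.epsilon = epsilon      # probability of a random exploratory choice
        self.rng = random.Random(seed)

    def choose(self):
        # Explore with probability epsilon, otherwise take the highest weight.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.weights))
        return max(self.weights, key=self.weights.get)

    def update(self, name, reward):
        # Exponential recency-weighted average: move the weight a fixed
        # fraction of the way toward the latest reward.
        self.weights[name] += self.step_size * (reward - self.weights[name])


def simulate(steps=400, switch_at=200, seed=0):
    """Toy run (assumed setup): h_a pays off first, then h_b takes over."""
    selector = HeuristicSelector(["h_a", "h_b"], seed=seed)
    for t in range(steps):
        name = selector.choose()
        best = "h_a" if t < switch_at else "h_b"
        reward = 1.0 if name == best else 0.0
        selector.update(name, reward)
    return selector


if __name__ == "__main__":
    print(simulate().weights)
```

In this toy run the selector favors h_a at first and moves to h_b once the rewards change, which is the kind of mid-search switching the abstract mentions. A sample-average update (step size 1/n) would react much more slowly to such a change, which is why a constant step size is assumed here.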
Similar resources
Learning Optimal Parameter Values in Dynamic Environment: An Experiment with Softmax Reinforcement Learning Algorithm
1. Introduction Many learning and heuristic search algorithms require tuning of parameters to achieve optimum performance. In stationary and deterministic problem domains this is usually achieved through off-line sensitivity analysis. However, this method breaks down in non-stationary and non-deterministic environments, where the optimal set of values for the parameters keeps changing over time....
A Greedy Divide-and-Conquer Approach to Optimizing Large Manufacturing Systems using Reinforcement Learning
Manufacturing is a challenging real-world domain for studying hierarchical MDP-based optimization algorithms. We have recently obtained very promising results using a hierarchical reinforcement learning based optimization algorithm for a 12-machine transfer line. Transfer lines model factory processes in automobile and many other product assembly plants. Unlike domains such as elevator scheduli...
An Improved Choice Function Heuristic Selection for Cross Domain Heuristic Search
Hyper-heuristics are a class of high-level search technologies for solving computationally difficult problems; they operate on a search space of low-level heuristics rather than on solutions directly. An iterative selection hyper-heuristic framework based on single-point search relies on two key components, a heuristic selection method and a move acceptance criterion. The Choice Function is an elegant ...
Addressing Environment Non-Stationarity by Repeating Q-learning Updates
Q-learning (QL) is a popular reinforcement learning algorithm that is guaranteed to converge to optimal policies in Markov decision processes. However, QL exhibits an artifact: in expectation, the effective rate of updating the value of an action depends on the probability of choosing that action. In other words, there is a tight coupling between the learning dynamics and underlying execution p...
Using Case Based Heuristics to Speed up Reinforcement Learning
The aim of this work is to combine three successful AI techniques, namely Reinforcement Learning (RL), Heuristic Search and Case Based Reasoning (CBR), creating a new algorithm that allows the use of cases in a case base as heuristics to speed up Reinforcement Learning algorithms. This approach, called Case Based Heuristically Accelerated Reinforcement Learning (CB-HARL), builds upon an emerging tech...