Generalizing Hyper-heuristics via Apprenticeship Learning
Authors
Abstract
An apprenticeship-learning-based technique is used as a hyper-heuristic to generate heuristics for an online combinatorial problem. It observes and learns from the actions of a known expert heuristic on small instances, and has the advantage of producing a general heuristic that works well on other, larger instances. Specifically, we generate heuristic policies for the online bin packing problem by using expert near-optimal policies produced by a hyper-heuristic on small instances, where learning is fast. The "expert" is a policy matrix that defines an index policy, and the apprenticeship learning is based on observing the actions of the expert policy together with a range of features of the bin being considered, and then applying k-means classification. We show that the generated policy often performs better than the standard best-fit heuristic even when applied to instances much larger than those in the training set.
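The following is a minimal sketch of what an index policy for online bin packing could look like, not the authors' implementation. It assumes that a policy matrix `W` holds a score `W[r][s]` for placing an item of size `s` into a bin with remaining capacity `r`, and that the feasible bin with the highest score is chosen. The names `pack_online`, `best_fit_score`, and `matrix_policy` are hypothetical and do not come from the paper; the tie-breaking rule (earliest-opened bin first) is also an assumption.

```python
from typing import Callable, List


def pack_online(items: List[int], capacity: int,
                score: Callable[[int, int], float]) -> List[int]:
    """Pack items one at a time; return the remaining capacity of each bin.

    score(r, s) rates placing an item of size s into a bin whose remaining
    capacity is r. The feasible bin with the highest score is chosen, and
    ties go to the earliest-opened bin (an assumption of this sketch).
    """
    bins: List[int] = []  # remaining capacity of each open bin
    for s in items:
        # Score every open bin that can still hold the item.
        candidates = [(score(r, s), -i, i)
                      for i, r in enumerate(bins) if r >= s]
        # A fresh, empty bin is always available as a last candidate.
        candidates.append((score(capacity, s), -len(bins), len(bins)))
        _, _, best = max(candidates)
        if best == len(bins):
            bins.append(capacity)
        bins[best] -= s
    return bins


def best_fit_score(r: int, s: int) -> float:
    """Best fit: prefer the bin that leaves the least space after packing."""
    return -(r - s)


def matrix_policy(W: List[List[float]]) -> Callable[[int, int], float]:
    """Turn a policy matrix W[r][s] into a scoring function (index policy)."""
    return lambda r, s: W[r][s]


if __name__ == "__main__":
    capacity = 10
    items = [5, 7, 3, 5]
    # Best fit expressed directly and as an equivalent policy matrix.
    W = [[-(r - s) for s in range(capacity + 1)] for r in range(capacity + 1)]
    print(pack_online(items, capacity, best_fit_score))
    print(pack_online(items, capacity, matrix_policy(W)))
```

In this framing, best fit is just one particular matrix. A learned or evolved matrix can assign different scores to individual (remaining capacity, item size) pairs, which is what lets an index policy deviate from best fit.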
Similar resources
Improving Performance of a Hyper-heuristic Using a Multilayer Perceptron for Vehicle Routing
A hyper-heuristic is a heuristic optimisation method which generates or selects heuristics (move operators) based on a set of components while solving a computationally difficult problem. Apprenticeship learning arises while observing the behaviour of an expert in action. In this study, we use a multilayer perceptron (MLP) as an apprenticeship learning algorithm to improve upon the performance ...
Generalizing Apprenticeship Learning across Hypothesis Classes
This paper develops a generalized apprenticeship learning protocol for reinforcement-learning agents with access to a teacher who provides policy traces (transition and reward observations). We characterize sufficient conditions of the underlying models for efficient apprenticeship learning and link these criteria to two established learnability classes (KWIK and Mistake Bound). We then construct...
PhD Thesis Proposal: Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Resource optimization in health care, manufacturing, and military operations requires the careful choreography of people and equipment to effectively fulfill the responsibilities of the profession. However, resource optimization is a computationally challenging problem, and poorly utilizing resources can have drastic consequences. Within these professions, there are human domain experts who are...
Apprenticeship learning with few examples
We consider the problem of imitation learning when the examples, provided by an expert human, are scarce. Apprenticeship Learning via Inverse Reinforcement Learning provides an efficient tool for generalizing the examples, based on the assumption that the expert’s policy maximizes a value function, which is a linear combination of state and action features. Most apprenticeship learning algorith...
A Tensor-based Approach to Nurse Rostering
Hyper-heuristics are high-level improvement search methodologies that explore the space of heuristics [4]. According to [5], hyper-heuristics can be categorized in many ways. A hyper-heuristic either selects from a set of available low-level heuristics or generates new heuristics from components of existing low-level heuristics to solve a problem, leading to a distinction between selection and generat...