Learning Skill Templates for Parameterized Tasks
Authors
Abstract
We consider the problem of learning skill templates for a parameterized reinforcement learning problem class $T$. That is, we assume that a task, i.e., an instance of the problem class, is defined by a task parameter vector $\tau \in T \subseteq \mathbb{R}^n$ and an associated interpretation. Likewise, a skill is a parameterized policy with parameter vector $\theta \in \mathbb{R}^m$. A parameterized skill [1] is a mapping $\Theta$ from a task vector $\tau$ to a skill vector $\theta_\tau$, i.e., $\Theta : \tau \to \theta_\tau$. Let $J(\theta, \tau)$ be the expected return of the skill parameterized by $\theta$ in task $\tau$; the goal of parameterized skill learning is to find a mapping $\Theta^*$ such that $\Theta^* = \arg\max_{\Theta} \int P(\tau)\, J(\Theta(\tau), \tau)\, d\tau$, where $P(\tau)$ is the task distribution. Because the parameterized skill $\Theta$ will typically not predict the optimal $\theta^*_\tau = \arg\max_{\theta} J(\theta, \tau)$ exactly, it is desirable to learn not only a point estimate of $\theta^*_\tau$ but also a measure of the uncertainty of this prediction. We propose to learn a so-called skill template $\Psi = (\Theta, \Omega)$, which contains a function $\Omega : \tau \to \Sigma_\tau$ with $\Sigma_\tau \in \mathbb{R}^{m \times m}$ that provides this uncertainty. $\Sigma_\tau$ can be interpreted as the covariance of a Gaussian distribution over the skill's parameter space. Thus, a skill template $\Psi$ can be seen as a mapping from a task to a Gaussian distribution over the skill parameter space, with $\Theta$ predicting the distribution's mean and $\Omega$ predicting the distribution's covariance. Skill templates are learned from a set of skill weights that have been learned for specific task instances. Let $E = \{(\tau_i, \theta_{\tau_i}) \mid i = 1, \dots, K\}$ be a training set consisting of experience collected in $K$ tasks with $J(\theta_{\tau_i}, \tau_i) \approx J(\theta^*_{\tau_i}, \tau_i)$. Learning the parameterized skill $\Theta$ can be considered a regression problem, trained on the pairs in $E$. While da Silva et al. [1] used Support Vector Regression for this regression task, we use Gaussian Process Regression (GPR) since it naturally provides an uncertainty estimate along with each prediction.
Different ways of learning $\Omega$ from $E$ are imaginable; in this abstract, we only consider the case of diagonal $\Sigma_\tau$ with $\Sigma_\tau$ either being a …
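The idea of a skill template, i.e., a map from a task vector to a Gaussian over skill parameters, can be illustrated with a minimal sketch. It is not the authors' implementation: the class name SkillTemplate, the squared-exponential kernel, the zero-mean prior, and the hyperparameters length_scale and noise are all assumptions made for illustration. The sketch fits one Gaussian process per skill dimension with a shared kernel, uses the predictive mean as Θ(τ), and uses the predictive variance on the diagonal of Σ_τ, which is only one of the possible diagonal choices and not necessarily the one the truncated abstract goes on to describe.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two task vectors (assumed kernel)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting.

    A is an n x n list of rows, B an n x q list of rows; returns X as n x q.
    """
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


class SkillTemplate:
    """Hypothetical skill template Psi = (Theta, Omega) built on GP regression."""

    def __init__(self, tasks, skills, length_scale=1.0, noise=1e-6):
        # tasks: K task vectors tau_i; skills: K skill vectors theta_{tau_i}.
        self.tasks = [list(t) for t in tasks]
        self.ls = length_scale
        self.m = len(skills[0])
        K = len(self.tasks)
        # Gram matrix with a small noise term on the diagonal.
        self.gram = [
            [rbf(self.tasks[i], self.tasks[j], self.ls) + (noise if i == j else 0.0)
             for j in range(K)]
            for i in range(K)
        ]
        # alpha = (K + noise * I)^{-1} Y, one column per skill dimension.
        self.alpha = solve(self.gram, [list(s) for s in skills])

    def predict(self, tau):
        """Return (mean, diag_cov): Theta(tau) and the diagonal of Sigma_tau."""
        k = [rbf(tau, t, self.ls) for t in self.tasks]
        mean = [sum(k[i] * self.alpha[i][d] for i in range(len(k)))
                for d in range(self.m)]
        v = solve(self.gram, [[ki] for ki in k])
        var = rbf(tau, tau, self.ls) - sum(ki * vi[0] for ki, vi in zip(k, v))
        var = max(var, 0.0)  # clamp tiny negative values from round-off
        # Shared kernel and unit prior variance: same variance in every dimension.
        return mean, [var] * self.m


# Toy training set E: three 1-D tasks, each with a 2-D skill vector.
template = SkillTemplate(
    tasks=[[0.0], [1.0], [2.0]],
    skills=[[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]],
)
mean, diag_cov = template.predict([1.5])
print("Theta(tau) =", mean)
print("diag(Sigma_tau) =", diag_cov)
```

In this sketch the variance is small near training tasks and approaches the prior variance far away from them, so the template becomes less confident for tasks unlike those in E. With a zero-mean prior, the predicted mean also falls back toward zero there; a practical implementation would likely subtract the mean of the training skill vectors first and fit the kernel hyperparameters to the data.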
Similar resources
Active Learning of Parameterized Skills
We introduce a method for actively learning parameterized skills. Parameterized skills are flexible behaviors that can solve any task drawn from a distribution of parameterized reinforcement learning problems. Approaches to learning such skills have been proposed, but limited attention has been given to identifying which training tasks allow for rapid skill acquisition. We construct a non-param...
Clustering via Dirichlet Process Mixture Models for Portable Skill Discovery
Skill discovery algorithms in reinforcement learning typically identify single states or regions in state space that correspond to task-specific subgoals. However, such methods do not directly address the question of how many distinct skills are appropriate for solving the tasks that the agent faces. This can be highly inefficient when many identified subgoals correspond to the same underlying ...
The Impact of Skill Integration on Task Involvement Load
The present study investigated whether word learning and retention in a second language are contingent upon a task's involvement load, i.e., the amount of need, search, and evaluation the task imposes. Laufer and Hulstijn (2001) contend that tasks with higher degrees of these three components induce higher involvement load, and are, therefore, more effective for word learning. To test this clai...
Learning Robot Skill Embeddings
We present a method for reinforcement learning of closely related skills that are parameterized via a skill embedding space. We learn such skills by taking advantage of latent variables and exploiting a connection between reinforcement learning and variational inference. The main contribution of our work is an entropy-regularized policy gradient formulation for hierarchical policies, and an asso...
Representation of robot motion control skill
Development of skilled robotics draws clues from model-based theories of human motor control. Thus, a comprehensive anthropomorphic background is given in the introductory part of the paper. Skills in robotics are viewed as a tool for fast and efficient real-time control that can handle complexity and nonlinearity of robots, generally aiming at robot autonomy. Particularly, a skill of redundanc...