Hierarchical & Factored Reinforcement Learning

Authors

  • Olga Kozlova
  • Alexey Kozlov
Abstract

This thesis was carried out in the context of industrial simulation, addressing the problem of modelling human behavior in military training and civil security simulations. The aim of this work is to solve large stochastic and sequential decision-making problems in the Markov Decision Process (MDP) framework, using Reinforcement Learning methods for learning and planning under uncertainty.

The Factored Markov Decision Process (FMDP) framework is a standard representation for sequential decision problems under uncertainty in which the state is represented as a collection of random variables. Factored Reinforcement Learning (FRL) is a model-based Reinforcement Learning approach to FMDPs in which the transition and reward functions of the problem are learned in a factored form. As a first contribution of this thesis, we show how to model, in a theoretically well-founded way, problems where some combinations of state variable values may not occur, giving rise to what we call impossible states. Furthermore, we propose a new heuristic that considers as impossible the states that have not been observed so far. We derive an algorithm whose improvement in performance with respect to the standard approach is illustrated through benchmark experiments on the MAZE6 and BLOCKS WORLD problems.

Besides, following the example of FMDPs, the Hierarchical MDP (HMDP) framework is also based on the idea of factorization, but takes that idea to a new level. Beyond the state factorization of FMDPs, HMDPs can take advantage of task factorization, where a set of similar situations (defined by their goals) is represented by a partially defined set of independent subtasks. In other words, it is possible to simplify a problem by splitting it into smaller problems that are easier to solve individually, and also to reuse the subtasks in order to speed up the global search for a solution. This kind of architecture can be efficiently represented using the options framework, by including temporally extended courses of action. The second contribution of this thesis introduces TeXDYNA, an algorithm designed to solve large MDPs with unknown structure by integrating hierarchical abstraction techniques from Hierarchical Reinforcement Learning (HRL) and factorization techniques from FRL.
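As a rough illustration of the impossible-states heuristic described in the abstract, the following Python sketch maintains a factored transition model from observed transitions and treats any combination of state-variable values that has never been observed as impossible, so a planner can skip it. This is a hypothetical sketch, not the algorithm from the thesis; the names (FactoredModel, observe, is_possible, reachable_states) are assumptions made here.

```python
from collections import defaultdict
from itertools import product


class FactoredModel:
    """Toy factored transition model with the 'unseen states are impossible' heuristic."""

    def __init__(self, variable_domains):
        # variable_domains: dict mapping each state variable to its possible values,
        # e.g. {"x": range(6), "y": range(6), "holding": (False, True)}
        self.domains = variable_domains
        self.seen = set()                                    # state combinations observed so far
        self.counts = defaultdict(lambda: defaultdict(int))  # (var, state, action) -> next-value counts

    def observe(self, state, action, next_state):
        """Record one transition; per-variable counts give a factored transition function."""
        self.seen.add(tuple(sorted(state.items())))
        self.seen.add(tuple(sorted(next_state.items())))
        for var, value in next_state.items():
            self.counts[(var, tuple(sorted(state.items())), action)][value] += 1

    def is_possible(self, state):
        """Heuristic: a combination of variable values never observed is treated as impossible."""
        return tuple(sorted(state.items())) in self.seen

    def reachable_states(self):
        """Enumerate only the joint assignments not pruned by the heuristic."""
        names = list(self.domains)
        for values in product(*(self.domains[n] for n in names)):
            state = dict(zip(names, values))
            if self.is_possible(state):
                yield state
```

A planner built on top of such a model would iterate only over reachable_states(), which is the kind of pruning that could account for the improvement reported on MAZE6 and BLOCKS WORLD. In a real FMDP learner, each variable would depend only on a small set of parent variables (a dynamic Bayesian network) rather than on the full state used here for brevity.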

Related references

Robot initiative increases the rhythm of interaction in a team learning task

We hypothesize that the initiative of a robot during a collaborative task with a human can influence the pace of interaction and the reaction time of the human response to attention cues. We designed a two-phase object learning experiment in which the human teaches the robot about the properties of some objects. We compare the effect of the initiator of the task in the teaching phase (human or ro...

Full text

Ludovic Saint-Bauzel

1Department of Human and Information Systems, Faculty of Engineering, Gifu University, 1-1 Yanagido, Gifu 501-1193, Japan 2Mechanical Engineering, School of Engineering, University of North Florida, 1 UNF Drive, Jacksonville, FL 32224, USA 3Department of Mechanical Engineering, Sogang University, Mapoku, Seoul 121-742, Republic of Korea 4 Institut des Systèmes Intelligents et de Robotique, Univ...

Full text

Rapid Prediction of Biomechanical

Groupe de recherche sur le système nerveux central (GRSNC), Département de Neuroscience, Université de Montréal, Montréal (QC), CANADA; Cognition and Action Laboratory, Institute of Neuroscience, Université Catholique de Louvain, 1200 Brussels, BELGIUM; UPMC, Univ Paris 06, UMR 7222, ISIR, 4 Place Jussieu, 75005 Paris, FRANCE; CNRS, UMR 7222, ISIR, 4 Place Jussieu,...

Full text

Reasoning about humans and its use in a cognitive control architecture for a collaborative robot

Rachid Alami, Aurélie Clodic CNRS, LAAS, 7 avenue du colonel Roche, F 31400 Toulouse, France Univ de Toulouse, LAAS, F 31400 Toulouse, France [email protected], [email protected] Raja Chatila Institut des Systèmes Intelligents et de Robotique Université Pierre et Marie Curie-Paris6, CNRS UMR 7222 4, Place Jussieu 75252 Paris Cedex 05 France [email protected] Séverin Lemaignan Co...

Full text

TeXDYNA: Hierarchical Reinforcement Learning in Factored MDPs

Reinforcement learning is one of the main adaptive mechanisms that is well documented in animal behaviour and that gives rise to computational studies in animats and robots. In this paper, we present TeXDYNA, an algorithm designed to solve large reinforcement learning problems with unknown structure by integrating hierarchical abstraction techniques of Hierarchical Reinforcement Learning and f...

Full text
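Since both the thesis abstract and the TeXDYNA reference above rely on the options framework, the sketch below shows what an option (a temporally extended course of action with an initiation set, an internal policy, and a termination condition) can look like. It is a minimal, hypothetical sketch, not TeXDYNA itself; the Option fields and the env.step interface are assumptions made here.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Set


@dataclass
class Option:
    """A temporally extended action in the options framework."""
    initiation_set: Set[Hashable]               # states where the option may be invoked
    policy: Callable[[Hashable], Hashable]      # maps a state to a primitive action
    termination: Callable[[Hashable], bool]     # True when the option's subtask ends


def run_option(env, state, option, max_steps=100):
    """Execute an option until it terminates; return the resulting state and cumulative reward."""
    assert state in option.initiation_set
    total_reward = 0.0
    for _ in range(max_steps):
        if option.termination(state):
            break
        action = option.policy(state)
        state, reward, done = env.step(action)  # assumed environment API: step -> (state, reward, done)
        total_reward += reward
        if done:
            break
    return state, total_reward
```

A hierarchical learner then treats each option as a single action at the higher level, which is how a set of similar subtasks can be reused to speed up the global search for a solution.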

Full text

Journal:

Volume   Issue

Pages  -

Publication date: 2010