Policy Feedback for the Refinement of Learned Motion Control on a Mobile Robot
Authors
B.D. Argall, Depts. of Electrical Engineering & Computer Science and Physical Medicine & Rehabilitation, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208, USA
B. Browning, The Robotics Institute, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA
M.M. Veloso, Computer Science Department, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA
Abstract
Motion control is fundamental to mobile robots, and the associated challenge in development can be assisted by the incorporation of execution experience to increase policy robustness. In this work, we present an approach that updates a policy learned from demonstration with human teacher feedback. We contribute advice-operators as a feedback form that provides corrections on state-action pairs produced during a learner execution, and Focused Feedback for Mobile Robot Policies (F3MRP) as a framework for providing feedback to rapidly-sampled policies. Both are appropriate for mobile robot motion control domains. We present a general feedback algorithm in which multiple types of feedback, including advice-operators, are provided through the F3MRP framework, and shown to improve policies initially derived from a set of behavior examples. A comparison to providing more behavior examples instead of more feedback finds data to be generated in different areas of the state and action spaces, and feedback to be more effective at improving policy performance while producing smaller datasets.
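The idea of correcting state-action pairs from a learner execution can be illustrated with a small sketch. This is not the authors' implementation: the 1-nearest-neighbour policy, the slow_down operator, the two-component (speed, turn) action, and the slice used to pick the corrected segment are all hypothetical choices made only to show the data flow, in which corrected pairs are added to the dataset and the policy changes as a result.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[float, ...]
Action = Tuple[float, float]  # assumed layout: (translational speed, turn rate)


@dataclass
class Policy:
    """1-nearest-neighbour policy over stored examples (an assumed stand-in)."""
    data: List[Tuple[State, Action]]

    def act(self, state: State) -> Action:
        # Return the action paired with the closest stored state
        # (squared Euclidean distance).
        nearest = min(
            self.data,
            key=lambda sa: sum((a - b) ** 2 for a, b in zip(sa[0], state)),
        )
        return nearest[1]


def slow_down(action: Action) -> Action:
    """Hypothetical advice-operator: cut translational speed by 20%."""
    speed, turn = action
    return (0.8 * speed, turn)


def apply_advice(
    policy: Policy,
    execution: List[Tuple[State, Action]],
    segment: slice,
    operator: Callable[[Action], Action],
) -> None:
    """Correct the flagged part of an execution and add it to the dataset."""
    for state, action in execution[segment]:
        policy.data.append((state, operator(action)))


# Two demonstrated examples, then one learner execution of three steps.
policy = Policy([((0.0,), (1.0, 0.0)), ((1.0,), (1.0, 0.5))])
execution = [(s, policy.act(s)) for s in [(0.1,), (0.9,), (1.1,)]]

# The teacher flags the last two steps and asks for slower motion there.
apply_advice(policy, execution, slice(1, 3), slow_down)
```

After the update, the policy returns the slower action near the corrected states and is unchanged near the first demonstrated state. Only two points were added, both taken from states the learner actually visited.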
Similar resources
Dynamic Load Carrying Capacity of Mobile-Base Flexible-Link Manipulators: Feedback Linearization Control Approach
This paper focuses on the effects of closed-loop control on the calculation of the dynamic load carrying capacity (DLCC) for mobile-base flexible-link manipulators. In methods previously proposed in the literature for DLCC calculation in flexible robots, an open-loop control scheme is assumed, whereas in reality, robot control is achieved via closed-loop approaches which could render the calculated ...
Mobile Robot Motion Control from Demonstration and Corrective Feedback
Robust motion control algorithms are fundamental to the successful, autonomous operation of mobile robots. Motion control is known to be a difficult problem, and is often dictated by a policy, or state-action mapping. In this chapter, we present an approach for the refinement of mobile robot motion control policies that incorporates corrective feedback from a human teacher. The target applicat...
Teacher feedback to scaffold and refine demonstrated motion primitives on a mobile robot
Task demonstration is an effective technique for developing robot motion control policies. As tasks become more complex, however, demonstration can become more difficult. In this work, we introduce an algorithm that uses corrective human feedback to build a policy able to perform a novel task, by combining simpler policies learned from demonstration. While some demonstration-based learning approach...
Direct Optimal Motion Planning for Omni-directional Mobile Robots under Limitation on Velocity and Acceleration
This paper describes a computationally inexpensive direct approach to optimal motion planning and obstacle avoidance for omni-directional mobile robots under velocity and acceleration constraints on the robot motion. The main purpose of this problem is the minimization of a quadratic cost function while limits on the velocity and acceleration of the robot are considered and collision with any obstacle in the...
Learning Mobile Robot Motion Control from Demonstrated Primitives and Human Feedback
Task demonstration is one effective technique for developing robot motion control policies. As tasks become more complex, however, demonstration can become more difficult. In this work we introduce a technique that uses corrective human feedback to build a policy able to perform an undemonstrated task from simpler policies learned from demonstration. Our algorithm first evaluates and corrects t...
Journal title:
- I. J. Social Robotics
Volume 4
Publication date: 2012