A Reinforcement Learning Approach for Goal-directed Locomotion of a Complex Snake-Like Robot
Abstract
Snake-like robots conventionally use undulation movements for locomotion [1], [2]. In contrast, a nonstandard snake-like robot with four screw-drive units connected serially by three active joints [3] propels itself by rotating its screw-drive units. With this screw-drive mechanism, undulation is not required for locomotion, which helps the robot move in narrow spaces. However, generating goal-directed locomotion for it is a challenging control problem. In this work, we tackle the problem with a reinforcement learning approach called Policy Improvement with Path Integrals (PI2) [4]. PI2 is numerically simple and can deal with high-dimensional systems. Here, it is used as a model-free learning mechanism to find a proper combination of the robot's seven locomotion control parameters (three yaw joint angles and four screw unit angular velocities) for moving towards a given goal. Learning is carried out in simulation, and the learned parameters are successfully transferred to the real robot. Experiments show that the robot, in different body configurations (such as straight-line, zigzag, and arc), can effectively move toward any given goal (see supplementary video at http://www.manoonpong.com/BCCN2014/SnakeRobot.wmv). In this way, a large repertoire of robot behaviors is obtained and used as motor primitives for generating new behaviors online. By selecting different primitives and properly chaining them, together with parameter interpolation and/or sensory feedback techniques, the robot can successfully handle complex tasks such as reaching a single goal or multiple goals while avoiding obstacles and compensating for a change of its body shape. Taken together, this study suggests how the PI2 reinforcement learning approach can be used to solve the coordination problem of systems with many degrees of freedom, such as the nonstandard snake-like robot, and to generate goal-directed locomotion in a complex environment.
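As a rough illustration of the kind of update described in the abstract, the sketch below applies a simplified, episode-level PI2-style step to a constant vector of seven parameters. It is not the authors' implementation. The function names, the noise level `sigma`, the sensitivity `h`, and especially `toy_cost` (a quadratic stand-in for a simulated rollout cost, with made-up target values) are assumptions made for this example only.

```python
import numpy as np


def pi2_update(theta, cost_fn, rng, n_rollouts=20, sigma=0.2, h=10.0):
    """One simplified PI2-style update of a constant parameter vector."""
    # Sample one exploration perturbation per rollout.
    eps = rng.normal(0.0, sigma, size=(n_rollouts, theta.size))
    costs = np.array([cost_fn(theta + e) for e in eps])
    # Turn costs into probabilities: lower cost gets a higher weight.
    spread = costs.max() - costs.min()
    scaled = (costs - costs.min()) / (spread + 1e-12)
    weights = np.exp(-h * scaled)
    weights /= weights.sum()
    # Move the parameters by the probability-weighted mean perturbation.
    return theta + weights @ eps


def toy_cost(params):
    # Hypothetical stand-in for a simulated rollout: squared distance to
    # made-up "good" values for 3 joint angles and 4 screw velocities.
    target = np.array([0.3, -0.2, 0.1, 1.0, 1.0, -1.0, -1.0])
    return float(np.sum((params - target) ** 2))


def learn(n_updates=100, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(7)  # three yaw joint angles + four screw velocities
    for _ in range(n_updates):
        theta = pi2_update(theta, toy_cost, rng)
    return theta
```

Calling `learn()` returns a seven-element vector whose toy cost is lower than that of the all-zero starting point. In a setting like the one described above, `toy_cost` would be replaced by a cost computed from a simulated rollout, for example the robot's remaining distance to the goal.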
Similar resources
Reinforcement Learning with PI2 Algorithm to Generate Motor Primitives of a Complex Snake-Like Robot
In this thesis work a policy improvement algorithm called Policy Improvement with Path Integrals (PI2) is used to generate goal-directed locomotion of a complex snake-like robot with screw-drive units. PI2 is numerically simple and has an ability to deal with high dimensional systems. Here, PI2 is used to find proper locomotion control parameters, like joint angles and screw-drive unit velociti...
Multiple-objective Optimization of Serpentine Locomotion with Snake Robot by Using the NSGA
This paper starts with developing a kinematic and dynamic model of a snake-shaped robot in serpentine locomotion and finishes with actual experimentation. At the beginning, the symmetrical and unsymmetrical serpenoid curves are introduced. Kinematics and dynamics of a snake robot on flat and inclined surfaces are obtained for a general n-link robot. SimMechanics toolbox of MATLAB software is employ...
Artificial Consciousness for Improving Reinforcement Learning
Reinforcement learning methods are useful for robot learning, but become slow when robots possess many degrees of freedom. We suggest equipping robots with fast on-board simulators, in order to accelerate learning. Such simulators will resemble forms of consciousness, enabling the robots to perform run-time trials in a simulated world, rather than tediously performing them in practice. We have ...
Extended QDSEGA for controlling real robots -acquisition of locomotion patterns for snake-like robot
Reinforcement learning is very effective for robot learning because it does not need prior knowledge and has a higher capability for reactive and adaptive behaviors. In our previous works, we proposed a new reinforcement learning algorithm: "Q-learning with Dynamic Structuring of Exploration Space Based on Genetic Algorithm (QDSEGA)". It is designed for complicated systems with large action-state space...
Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)
In this paper, we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through the dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...