Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration

Authors

  • Bo Liu
  • Sanfeng Chen
  • Shuai Li
  • Yongsheng Liang
Abstract

In this paper, a new framework called Compressive Kernelized Reinforcement Learning (CKRL) is proposed for computing near-optimal policies in sequential decision making under uncertainty. It combines non-adaptive, data-independent Random Projections with nonparametric Kernelized Least-Squares Policy Iteration (KLSPI). Random Projections are a fast, non-adaptive dimensionality reduction framework in which high-dimensional data is projected onto a random lower-dimensional subspace via spherically random rotation and coordinate sampling. KLSPI introduces the kernel trick into the LSPI framework for Reinforcement Learning, often achieving faster convergence and providing automatic feature selection via various kernel sparsification approaches. In this approach, policies are computed in a low-dimensional subspace generated by projecting the high-dimensional features onto a set of random basis vectors. We first show how Random Projections constitute an efficient sparsification technique and how our method often converges faster than regular LSPI, at lower computational cost. The theoretical foundation underlying this approach is a fast approximation of Singular Value Decomposition (SVD). Finally, simulation results on benchmark MDP domains are presented, which confirm gains in both computation time and performance in large feature spaces.
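The idea of evaluating a policy in a randomly projected feature space can be illustrated with a minimal sketch. This is not the authors' CKRL algorithm: it assumes plain linear features instead of kernels, uses a dense Gaussian random matrix in place of the rotation-and-sampling construction mentioned in the abstract, and shows only the LSTD-Q policy evaluation step that LSPI repeats at each iteration. All function names, the ridge term, and the toy data are hypothetical.

```python
import random

def gaussian_projection(d_low, d_high, rng):
    """Return a d_low x d_high matrix with i.i.d. N(0, 1/d_low) entries (assumed construction)."""
    scale = (1.0 / d_low) ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(d_high)] for _ in range(d_low)]

def project(P, phi):
    """Compress a high-dimensional feature vector: z = P * phi."""
    return [sum(p * x for p, x in zip(row, phi)) for row in P]

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def lstdq_compressed(samples, P, gamma=0.9, ridge=1e-6):
    """LSTD-Q in the projected space.

    samples: list of (phi, reward, phi_next), where phi_next is the feature
    vector of the next state paired with the action the current policy picks.
    The small ridge term (an assumption) keeps the linear system well posed.
    """
    d = len(P)
    A = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    for phi, r, phi_next in samples:
        z, z_next = project(P, phi), project(P, phi_next)
        for i in range(d):
            b[i] += z[i] * r
            for j in range(d):
                A[i][j] += z[i] * (z[j] - gamma * z_next[j])
    return solve(A, b)

# Toy usage with made-up data: 200 original features compressed to 10.
rng = random.Random(0)
D, d = 200, 10
P = gaussian_projection(d, D, rng)
samples = [([rng.random() for _ in range(D)], rng.random(),
            [rng.random() for _ in range(D)]) for _ in range(50)]
w = lstdq_compressed(samples, P)
```

The linear system solved here has size d x d rather than D x D, which is where the computational saving of working in the projected subspace would come from in this simplified setting.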

Similar resources

Manifold Regularization for Kernelized LSTD

Policy evaluation or value function or Q-function approximation is a key procedure in reinforcement learning (RL). It is a necessary component of policy iteration and can be used for variance reduction in policy gradient methods. Therefore its quality has a significant impact on most RL algorithms. Motivated by manifold regularized learning, we propose a novel kernelized policy evaluation metho...

An Adaptive Weighted Fuzzy Controller Applied on Quality of Service of Intelligent 5G Environments

In the computational intelligence area, it is suitable to carry out the analysis in order to interpret the concept and sources of uncertainty and the conditions of its incidence, and hence to pursue reliable techniques for dealing with it. Dealing with uncertainties in this case is a challenging and multidisciplinary activity. So, there is a need for a capable tool for modeling, control, and analyti...

Self-tuning Control of an Electro-hydraulic Actuator System

Due to time-varying effects in electro-hydraulic actuator (EHA) system parameters, a self-tuning control algorithm using pole placement and recursive identification is presented. A discrete-time model is developed using a system identification method to represent the EHA system, and residual analysis is used for model validation. A recursive least squares (RLS) method with covariance resetting techn...

Robust Control of a Quadrotor in the Presence of Actuators' Failure

Today, robots and unmanned aerial vehicles are used extensively in modern societies. Owing to their wide range of applications, they have attracted much attention among scientists over the past decades. This paper deals with the problem of the stability of a four-rotor flying robot called a quadrotor, which is an under-actuated system, in the presence of actuator or sensor failures. The dynamica...

Oscillation Control of Aircraft Shock Absorber Subsystem Using Intelligent Active Performance and Optimized Classical Techniques Under Sine Wave Runway Excitation (TECHNICAL NOTE)

This paper describes a third aircraft model with 2 degrees of freedom. The aim of this study is to develop a mathematical model for investigating adoptable landing gear vibration behavior and to design Proportional-Integral-Derivative (PID) classical techniques for control of an active hydraulic nonlinear actuator. The parameters of controller and suspension system are adjusted according to be...

Journal title:

Volume 12, Issue

Pages -

Publication date: 2012