Newton based Stochastic Optimization using q-Gaussian Smoothed Functional Algorithms
نویسندگان
چکیده
We present the first q-Gaussian smoothed functional (SF) estimator of the Hessian and the first Newton-based stochastic optimization algorithm that estimates both the Hessian and the gradient of the objective function using q-Gaussian perturbations. Our algorithm requires only two system simulations (regardless of the parameter dimension) and estimates both the gradient and the Hessian at each update epoch using these. We also present a proof of convergence of the proposed algorithm. In a related recent work (Ghoshdastidar et al., 2013), we presented gradient SF algorithms based on the q-Gaussian perturbations. Our work extends prior work on smoothed functional algorithms by generalizing the class of perturbation distributions as most distributions reported in the literature for which SF algorithms are known to work and turn out to be special cases of the q-Gaussian distribution. Besides studying the convergence properties of our algorithm analytically, we also show the results of several numerical simulations on a model of a queuing network, that illustrate the significance of the proposed method. In particular, we observe that our algorithm performs better in most cases, over a wide range of q-values, in comparison to Newton SF algorithms with the Gaussian (Bhatnagar, 2007) and Cauchy perturbations, as well as the gradient q-Gaussian SF algorithms (Ghoshdastidar et al., 2013).
منابع مشابه
Smoothed Action Value Functions
State-action value functions (i.e., Q-values) are ubiquitous in reinforcement learning (RL), giving rise to popular algorithms such as SARSA and Q-learning. We propose a new notion of action value defined by a Gaussian smoothed version of the expected Q-value. We show that such smoothed Q-values still satisfy a Bellman equation, making them learnable from experience sampled from an environment....
متن کاملSmoothed Action Value Functions
State-action value functions (i.e., Q-values) are ubiquitous in reinforcement learning (RL), giving rise to popular algorithms such as SARSA and Q-learning. We propose a new notion of action value defined by a Gaussian smoothed version of the expected Q-value. We show that such smoothed Q-values still satisfy a Bellman equation, making them learnable from experience sampled from an environment....
متن کاملSmoothed Action Value Functions for Learning Gaussian Policies
State-action value functions (i.e., Q-values) are ubiquitous in reinforcement learning (RL), giving rise to popular algorithms such as SARSA and Qlearning. We propose a new notion of action value defined by a Gaussian smoothed version of the expected Q-value. We show that such smoothed Q-values still satisfy a Bellman equation, making them learnable from experience sampled from an environment. ...
متن کاملSmoothed Functional and Quasi-Newton Algorithms for Routing in Multi-stage Queueing Network with Constraints
We consider the problem of optimal routing in a multi-stage network of queues with constraints on queue lengths. We develop three algorithms for probabilistic routing for this problem using only the total end-to-end delays. These algorithms use the smoothed functional (SF) approach to optimize the routing probabilities. In our model all the queues are assumed to have constraints on the average ...
متن کاملBlind Deconvolution Using the Relative Newton Method
We propose a relative optimization framework for quasi maximum likelihood blind deconvolution and the relative Newton method as its particular instance. Special Hessian structure allows its fast approximate construction and inversion with complexity comparable to that of gradient methods. The use of rational IIR restoration kernels provides a richer family of filters than the traditionally used...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Automatica
دوره 50 شماره
صفحات -
تاریخ انتشار 2014