Basis Construction from Power Series Expansions of Value Functions

Authors

  • Sridhar Mahadevan
  • Bo Liu
Abstract

This paper explores links between basis construction methods in Markov decision processes and power series expansions of value functions. This perspective provides a useful framework for analyzing properties of existing bases and offers insight into constructing more effective ones. Krylov and Bellman error bases are based on the Neumann series expansion. These bases incur very large initial Bellman errors and can converge rather slowly as the discount factor approaches unity. The Laurent series expansion, which relates discounted and average-reward formulations, both explains this slow convergence and suggests a way to construct more efficient basis representations. The first two terms in the Laurent series represent the scaled average reward and the average-adjusted sum of rewards; subsequent terms expand the discounted value function using powers of a generalized inverse, called the Drazin (or group) inverse, of a singular matrix derived from the transition matrix. Experiments show that Drazin bases converge considerably more quickly than several other bases, particularly for large values of the discount factor. An incremental variant of Drazin bases, called Bellman average-reward bases (BARBs), is described; it provides some of the same benefits at lower computational cost.
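
The contrast between the two expansions can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the 3-state chain, the reward vector, the discount factor 0.95, and all function names are assumptions chosen for illustration. It assumes an ergodic chain, so that the limiting matrix P* has identical rows equal to the stationary distribution, and it uses the textbook form of the Laurent expansion with rho = (1 - gamma) / gamma, which converges only when rho is smaller than the smallest nonzero eigenvalue modulus of I - P.

```python
import numpy as np

def stationary_distribution(P):
    """Solve pi P = pi with sum(pi) = 1 (assumes a single recurrent class)."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def drazin_inverse(P):
    """Return (H, P_star), where H is the Drazin (group) inverse of I - P."""
    n = P.shape[0]
    P_star = np.tile(stationary_distribution(P), (n, 1))
    H = np.linalg.inv(np.eye(n) - P + P_star) - P_star
    return H, P_star

def neumann_value(P, R, gamma, n_terms):
    """Truncated Neumann series: sum over t < n_terms of gamma^t P^t R."""
    total = np.zeros_like(R)
    term = R.copy()
    for _ in range(n_terms):
        total = total + term
        term = gamma * (P @ term)
    return total

def laurent_value(P, R, gamma, n_terms):
    """Truncated Laurent series:
    (1 + rho) * (P_star R / rho + sum over n < n_terms of (-rho)^n H^(n+1) R).
    """
    rho = (1.0 - gamma) / gamma
    H, P_star = drazin_inverse(P)
    total = (P_star @ R) / rho      # scaled average-reward (gain) term
    term = H @ R                    # average-adjusted sum of rewards (bias)
    for n in range(n_terms):
        total = total + (-rho) ** n * term
        term = H @ term
    return (1.0 + rho) * total

# Hypothetical 3-state ergodic chain and reward vector.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
R = np.array([1.0, 0.0, 2.0])
gamma = 0.95

# Exact discounted value function for comparison.
v_exact = np.linalg.solve(np.eye(3) - gamma * P, R)

# Compare both expansions when each keeps only its first two terms.
err_neumann = np.max(np.abs(neumann_value(P, R, gamma, 2) - v_exact))
err_laurent = np.max(np.abs(laurent_value(P, R, gamma, 1) - v_exact))
print("Neumann, 2 terms, max error:", err_neumann)
print("Laurent, 2 terms, max error:", err_laurent)
```

On this toy chain the two-term Laurent truncation (gain plus bias) should land much closer to the exact value function than the two-term Neumann truncation, which is consistent with the abstract's point about slow convergence as the discount factor approaches unity. The sketch only compares series truncations; it does not reproduce the basis-construction experiments described in the paper.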

Similar resources

Eigenfunction Expansions for Second-Order Boundary Value Problems with Separated Boundary Conditions

In this paper, we investigate some properties of eigenvalues and eigenfunctions of boundary value problems with separated boundary conditions. Also, we obtain formal series solutions for some partial differential equations associated with the second order differential equation, and study necessary and sufficient conditions for the negative and positive eigenvalues of the boundary value problem....

Nonharmonic Gabor Expansions

We consider Gabor systems generated by a Gaussian function and prove certain classical results of Paley and Wiener on nonharmonic Fourier series of complex exponentials for the Gabor expansion. In particular, we prove a version of Plancherel-Pólya theorem for entire functions with finite order of growth and use the Hadamard factorization theorem to study regularity, exactness and deficienc...

New method to obtain small parameter power series expansions of Mathieu radial and angular functions

Small parameter power series expansions for both radial and angular Mathieu functions are derived. The expansions are valid for all integer orders and apply the Stratton-Morse-Chu normalization. Three new contributions are provided: (1) explicit power series expansions for the radial functions, which are not available in the literature; (2) improved convergence rate of the power series expansio...

Nonlinear Finite Element Analysis of Bending of Straight Beams Using hp-Spectral Approximations

Displacement finite element models of various beam theories have been developed using traditional finite element interpolations (i.e., Hermite cubic or equi-spaced Lagrange functions). Various finite element models of beams differ from each other in the choice of the interpolation functions used for the transverse deflection w, total rotation φ and/or shear strain γxz, or in the integral form u...

A novel method based on a combination of deep learning algorithm and fuzzy intelligent functions in order to classification of power quality disturbances in power systems

Automatic classification of power quality disturbances is the foundation to deal with power quality problem. From the traditional point of view, the identification process of power quality disturbances should be divided into three independent stages: signal analysis, feature selection and classification. However, there are some inherent defects in signal analysis and the procedure of manual fe...

Publication date: 2010