Linear stochastic approximation driven by slowly varying Markov chains

Authors

  • Vijay R. Konda
  • John N. Tsitsiklis
Abstract

We study a linear stochastic approximation algorithm that arises in the context of reinforcement learning. The algorithm employs a decreasing step-size, and is driven by Markov noise with time-varying statistics. We show that under suitable conditions, the algorithm can track the changes in the statistics of the Markov noise, as long as these changes are slower than the rate at which the step-size of the algorithm goes to zero. © 2003 Elsevier B.V. All rights reserved.
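The tracking behaviour described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm or experimental setup: the scalar recursion `r += gamma * (h[x] - G[x] * r)`, the two-state chain, the functions `h` and `G`, the parameter `theta`, and the step-size schedules `gamma` and `beta` are all assumptions chosen for illustration. The chain's stationary probability of state 1 equals `theta`, and `theta` drifts with step-size `beta`, which goes to zero faster than `gamma`.

```python
import random


def target(theta):
    # Root of E_pi[h(X) - G(X) * r] = 0 under the stationary law
    # pi = (1 - theta, theta), with the assumed h = (0, 2) and G = (1, 2).
    return 2.0 * theta / (1.0 + theta)


def track(num_steps=200_000, seed=0):
    """Run an illustrative linear stochastic approximation recursion.

    All modelling choices here are hypothetical, picked only to show
    the iterate following a slowly drifting target.
    """
    rng = random.Random(seed)
    h = (0.0, 2.0)  # assumed forcing term per state
    G = (1.0, 2.0)  # assumed (positive) gain per state
    theta, x, r = 0.2, 0, 0.0
    for k in range(num_steps):
        gamma = (k + 1) ** -0.6   # step-size of the iterate
        beta = 1.0 / (k + 10)     # slower rate: beta / gamma -> 0
        # One transition of the chain; its stationary law is (1 - theta, theta).
        if x == 0:
            if rng.random() < 0.5 * theta:
                x = 1
        elif rng.random() < 0.5 * (1.0 - theta):
            x = 0
        # Linear stochastic approximation update driven by the Markov state.
        r += gamma * (h[x] - G[x] * r)
        # The statistics of the noise change slowly as theta drifts to 0.8.
        theta += beta * (0.8 - theta)
    return r, theta


if __name__ == "__main__":
    r, theta = track()
    print(f"iterate r = {r:.4f}, target for current theta = {target(theta):.4f}")
```

In this toy setting the iterate should end up close to `target(theta)` even though `theta` keeps changing, because the drift in `theta` is slower than the step-size of the iterate.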


Similar resources

Approximating Queues in Slowly Varying Stationary Environments

We provide linear approximations to the marginal distributions for a class of infinite-state continuous-time stationary Markov chains in slowly varying environments. We take an approach motivated by light-traffic approximations to stationary point processes, which permits us to consider general stationary environments. Under mild assumptions we show that Jackson networks with routing not affecte...


Markov Chains Approximation of Jump-Diffusion Quantum Trajectories

“Quantum trajectories” are solutions of stochastic differential equations also called Belavkin or Stochastic Schrödinger Equations. They describe random phenomena in quantum measurement theory. Two types of such equations are usually considered, one is driven by a one-dimensional Brownian motion and the other is driven by a counting process. In this article, we present a way to obtain more adva...


On Markov Chain Approximations to Semilinear Partial Differential Equations Driven by Poisson Measure Noise

We consider the stochastic model of water pollution, which mathematically can be written with a stochastic partial differential equation driven by Poisson measure noise. We use a stochastic particle Markov chain method to produce an implementable approximate solution. Our main result is the annealed law of large numbers establishing convergence in probability of our Markov chains to the solutio...


Persistent tracking and identification of regime-switching systems with structural uncertainties: unmodeled dynamics, observation bias, and nonlinear model mismatch

This work focuses on tracking and system identification of systems with regime-switching parameters, which are modeled by a Markov process. It introduces a framework for persistent identification problems that encompass many typical system uncertainties, including parameter switching, stochastic observation disturbances, deterministic unmodeled dynamics, sensor observation bias, and nonlinear m...


Stochastic Dynamic Programming with Markov Chains for Optimal Sustainable Control of the Forest Sector with Continuous Cover Forestry

We present a stochastic dynamic programming approach with Markov chains for optimal control of the forest sector. The forest is managed via continuous cover forestry and the complete system is sustainable. Forest industry production, logistic solutions and harvest levels are optimized based on the sequentially revealed states of the markets. Adaptive full system optimization is necessary for co...



Journal:
  • Systems & Control Letters

Volume 50


Publication date: 2003