Search results for: bellman

Number of results: 4956

Journal: J. Applied Mathematics 2013
Bin Zheng Qinghua Feng Fanwei Meng

In this work, we investigate some new Gronwall-Bellman type dynamic inequalities on time scales in two independent variables, which provide a handy tool in deriving explicit bounds on unknown functions in certain dynamic equations on time scales. The established results generalize the main results on integral inequalities for continuous functions in [1] and their corresponding discrete analysis...

2003
Chandeok Park Panagiotis Tsiotras

We present a numerical algorithm for solving the Hamilton-Jacobi-Bellman equation using a successive Galerkin-wavelet projection scheme. According to this scheme, the so-called Generalized-Hamilton-Jacobi-Bellman (GHJB) equation is solved iteratively starting from a stabilizing solution. As basis functions for the Galerkin projections we consider the antiderivatives of the well-known Daubechies’ ...

2010
Dotan Di Castro Shie Mannor

We consider the problem of reinforcement learning using function approximation, where the approximating basis can change dynamically while interacting with the environment. A motivation for such an approach is maximizing the value function fitness to the problem faced. Three errors are considered: approximation square error, Bellman residual, and projected Bellman residual. Algorithms under the...

2017
Dave Mount

All-Pairs Shortest Paths: Earlier, we saw that Dijkstra’s algorithm and the Bellman-Ford algorithm both solved the problem of computing shortest paths in graphs from a single source vertex. Suppose that we want instead to compute shortest paths between all pairs of vertices. We could do this by applying either Dijkstra or Bellman-Ford using every vertex as a source, but today we will consider an a...
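
The baseline mentioned in this excerpt, running a single-source algorithm once per vertex, can be sketched as follows. This is only an illustration of that repeated Bellman-Ford idea, not the algorithm the truncated notes go on to present; the names `bellman_ford` and `all_pairs`, and the edge-list representation `(u, v, weight)`, are assumptions made for the example.

```python
from math import inf


def bellman_ford(n, edges, source):
    """Single-source shortest paths on vertices 0..n-1.

    `edges` is a list of directed (u, v, weight) triples (assumed format).
    """
    dist = [inf] * n
    dist[source] = 0
    # Relax every edge up to n - 1 times; stop early once nothing changes.
    for _ in range(n - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    # One extra pass: any further improvement implies a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative-weight cycle reachable from source")
    return dist


def all_pairs(n, edges):
    """Run Bellman-Ford from every vertex; row s holds distances from s."""
    return [bellman_ford(n, edges, s) for s in range(n)]


if __name__ == "__main__":
    example_edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, -1)]
    for row in all_pairs(4, example_edges):
        print(row)
```

Each Bellman-Ford run costs O(VE), so repeating it from all V sources costs O(V²E), which is the kind of overhead that motivates looking for a dedicated all-pairs method.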

Journal: SIAM J. Control and Optimization 2014
Guy Barles Ariela Briani Emmanuel Chasseigne

This article is a continuation of a previous work where we studied infinite horizon control problems for which the dynamic, running cost and control space may be different in two half-spaces of some Euclidean space R. In this article we extend our results in several directions: (i) to more general domains; (ii) by considering finite horizon control problems; (iii) by weakening the controllability a...

Journal: Automatica 2010
Chang-Hee Won Ronald W. Diersing Bei Kang

In statistical control, the cost function is viewed as a random variable and one optimizes the distribution of the cost function through the cost cumulants. We consider a statistical control problem for a control-affine nonlinear system with a nonquadratic cost function. Using the Dynkin formula, the Hamilton–Jacobi–Bellman equation for the nth cost moment case is derived as a necessary conditi...

Journal: Journal of Machine Learning Research 2011
Marek Petrik Shlomo Zilberstein

Value function approximation methods have been successfully used in many applications, but the prevailing techniques often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing spe...

Journal: SIAM J. Numerical Analysis 2016
P. Azimzadeh Peter A. Forsyth

This work is motivated by numerical solutions to Hamilton-Jacobi-Bellman quasivariational inequalities (HJBQVIs) associated with combined stochastic and impulse control problems. In particular, we consider (i) direct control, (ii) penalized, and (iii) semi-Lagrangian discretization schemes applied to the HJBQVI problem. Scheme (i) takes the form of a Bellman problem involving an operator which ...

2011
Marek Petrik Shlomo Zilberstein

Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms ...
