policy iterations

Convergence of Iterations

2005

Paul Cull

Convergence is a central problem in both computer science and population biology. Will a program terminate? Will a population reach an equilibrium? In general, these questions are quite difficult, and even unsolvable. In this paper we will concentrate on very simple iterations of the form

Iterations with Mixed Support

2007

Vera Fischer

In this talk we will consider three properties of iterations with mixed (finite/countable) supports: iterations of arbitrary length preserve ω₁; iterations of length ≤ ω₂ over a model of CH have the ℵ₂-chain condition; and iterations of length < ω₂ over a model of CH do not increase the size of the continuum. Definition 1. Let Pκ be an iterated forcing construction of length κ, with iterands 〈Q̇α...

Combinatorics of Polynomial Iterations

2007

Volodymyr Nekrashevych

A complete description of the iterated monodromy groups of postcritically finite backward polynomial iterations is given in terms of their actions on rooted trees and the automata generating them. We describe an iterative algorithm for finding kneading automata associated with postcritically finite topological polynomials and discuss some open questions about iterated monodromy groups of polynomials.

A note on the convergence of the SDDP algorithm

2012

Kengy Barty

In this paper we are interested in the convergence analysis of the Stochastic Dual Dynamic Programming (SDDP) algorithm in a general framework, regardless of whether the underlying probability space is discrete or not. We consider a convex stochastic control program, not necessarily linear, and the resulting dynamic programming equation. We prove under mild assumptions that the approximations o...

Guided Policy Search as Approximate Mirror Descent

Journal: CoRR, 2016

William Montgomery, Sergey Levine

Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space. Instead, these methods use supervised learning to train the policy to mimic a “teacher” algorithm, such as a trajectory optimizer or a trajectory-centric reinforcement learning method. Guided policy...

Guided Policy Search via Approximate Mirror Descent

2016

William H. Montgomery, Sergey Levine

Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space. Instead, these methods use supervised learning to train the policy to mimic a “teacher” algorithm, such as a trajectory optimizer or a trajectory-centric reinforcement learning method. Guided policy...

Some applications of mixed support iterations

Approximate Nullspace Iterations for KKT Systems

Warped proximal iterations for monotone inclusions

Monotone iterations for nonlinear obstacle problem