Search results for: policy iterations

Number of results: 276392

2005
Paul Cull

Convergence is a central problem in both computer science and population biology. Will a program terminate? Will a population go to an equilibrium? In general these questions are quite difficult – even unsolvable. In this paper we will concentrate on very simple iterations of the form

2007
Vera Fischer

In this talk we will consider three properties of iterations with mixed (finite/countable) supports: iterations of arbitrary length preserve ω1, iterations of length ≤ ω2 over a model of CH have the ℵ2-chain condition and iterations of length < ω2 over a model of CH do not increase the size of the continuum. Definition 1. Let Pκ be an iterated forcing construction of length κ, with iterands 〈Q̇α...

2007
Volodymyr Nekrashevych

A complete description of the iterated monodromy groups of postcritically finite backward polynomial iterations is given in terms of their actions on rooted trees and automata generating them. We describe an iterative algorithm for finding kneading automata associated with postcritically finite topological polynomials and discuss some open questions about iterated monodromy groups of polynomials.

2012
Kengy Barty

In this paper we are interested in the convergence analysis of the Stochastic Dual Dynamic Programming (SDDP) algorithm in a general framework, and regardless of whether the underlying probability space is discrete or not. We consider a convex stochastic control program not necessarily linear and the resulting dynamic programming equation. We prove under mild assumptions that the approximations o...

Journal: CoRR 2016
William Montgomery, Sergey Levine

Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space. Instead, these methods use supervised learning to train the policy to mimic a “teacher” algorithm, such as a trajectory optimizer or a trajectory-centric reinforcement learning method. Guided policy...

Journal: Annals of Pure and Applied Logic 2009

Journal: SIAM Journal on Matrix Analysis and Applications 2010

Journal: Journal of Mathematical Analysis and Applications 2020

Journal: The Journal of the Australian Mathematical Society. Series B. Applied Mathematics 1990
