policy iterations

Search results for: policy iterations

Number of results: 276392

An Empirical Study of the Workload Distribution Under Static Scheduling

1994

Zhiyuan Li, Trung N. Nguyen

In the decision regarding static scheduling vs. dynamic scheduling, the only argument against the former is the potential imbalance of the workload. However, it has never been clear how the workload is distributed across the iterations of Fortran parallel loops. This work examines a set of Perfect benchmarking programs [2] and reports two striking results. First, when using operation counts as the mea...


Convergence Analysis of Policy Iteration

Journal: CoRR, 2015

Ali Heydari

Adaptive optimal control of nonlinear dynamic systems with deterministic and known dynamics under a known undiscounted infinite-horizon cost function is investigated. A policy iteration scheme initiated using a stabilizing initial control is analyzed in solving the problem. The convergence of the iterations and the optimality of the limit functions, which follows from the established uniqueness o...


Accelerating of Modified Policy Iteration in Probabilistic Model Checking

2016

Mohammadsadegh Mohagheghi

Markov Decision Processes (MDPs) are used to model both non-deterministic and probabilistic systems. Probabilistic model checking is an approach for verifying quantitative properties of probabilistic systems that are modeled by MDPs. Value Iteration, Policy Iteration, and modified versions of them are well-known approaches for computing a wide range of probabilistic properties. This paper tries to impro...


Multi-Agent Planning with Baseline Regret Minimization

2017

Feng Wu, Shlomo Zilberstein, Xiaoping Chen

We propose a novel baseline regret minimization algorithm for multi-agent planning problems modeled as finite-horizon decentralized POMDPs. It guarantees to produce a policy that is provably at least as good as a given baseline policy. We also propose an iterative belief generation algorithm to efficiently minimize the baseline regret, which only requires necessary iterations so as to converge ...


An Efficient Algorithm for Computing Optimal (s, S) Policies

Journal: Operations Research, 1984

Awi Federgruen, Paul H. Zipkin

This paper presents an algorithm to compute an optimal (s, S) policy under standard assumptions (stationary data, well-behaved one-period costs, discrete demand, full backlogging, and the average-cost criterion). The method is iterative, starting with an arbitrary, given (s, S) policy and converging to an optimal policy in a finite number of iterations. Any of the available approximations can t...


Stochastic Primal-Dual Methods and Sample Complexity of Reinforcement Learning

Journal: CoRR, 2016

Yichen Chen, Mengdi Wang

We study the online estimation of the optimal policy of a Markov decision process (MDP). We propose a class of Stochastic Primal-Dual (SPD) methods which exploit the inherent minimax duality of Bellman equations. The SPD methods update a few coordinates of the value and policy estimates as a new state transition is observed. These methods use small storage and have low computational complexity p...


Composing Meta-Policies for Autonomous Driving Using Hierarchical Deep Reinforcement Learning

Journal: CoRR, 2016

Richard Liaw, Sanjay Krishnan, Animesh Garg, Daniel Crankshaw, Joseph Gonzalez, Kenneth Y. Goldberg

Rather than learning new control policies for each new task, it is possible, when tasks share some structure, to compose a "meta-policy" from previously learned policies. This paper reports results from experiments using Deep Reinforcement Learning on a continuous-state, discrete-action autonomous driving simulator. We explore how Deep Neural Networks can represent meta-policies that switch amo...


Empirical Results on Convergence and Exploration in Approximate Policy Iteration

2005

Niket S. Kaisare, Jong Min Lee, Jay H. Lee

In this paper, we empirically investigate the convergence properties of policy iteration applied to the optimal control of systems with continuous state and action spaces. We demonstrate that policy iteration requires fewer iterations than value iteration to converge, but requires more function evaluations to generate cost-to-go approximations in the policy evaluation step. Two different alter...


Fixed point Ishikawa iterations

Journal: Journal of Mathematical Analysis and Applications, 1992


Solving equations by iterations

Journal: Časopis pro pěstování matematiky a fysiky, 1928

