policy iterations

Search results for: policy iterations

Number of results: 276392

Dynamic asynchronous iterations

Journal: Journal of Parallel and Distributed Computing 2022

Many problems can be solved by iteration by multiple participants (processors, servers, routers etc.). Previous mathematical models for such asynchronous iterations assume a single function being iterated by a fixed set of participants. We will call such iterations static since the system's configuration does not change. However, in several real-world examples, such as inter-domain routing, both the function and the set of participants change frequently while ...

Identifying key steps in developing a one-stop shop for health policy and system information in a limited-resource setting: A case study

Journal: Journal of Research and Health 2022

Boniface Mutatina, John Norman Lavis, Nelson Kawulukusi Sewankambo, Robert Kanyarutokye Basaza

Background: There is limited understanding about the development of the online one-stop shops for evidence in a limited-resource setting, such as Uganda. This study aimed to provide a comprehensive account of the development process of the online resource for local policy and systems-relevant information in this setting. Methods: We utilized a case study design to address our objective where ...

Learning to Optimize

Journal: CoRR 2016

Ke Li, Jitendra Malik

Algorithm design is a laborious process and often requires many iterations of ideation and validation. In this paper, we explore automating algorithm design and present a method to learn an optimization algorithm. We approach this problem from a reinforcement learning perspective and represent any particular optimization algorithm as a policy. We learn an optimization algorithm using guided pol...

Policy iterations for reinforcement learning problems in continuous time and space — Fundamental theory and methods

Journal: Automatica 2021

Policy iteration (PI) is a recursive process of policy evaluation and improvement for solving an optimal decision-making/control problem, or in other words, a reinforcement learning (RL) problem. PI has also served as the fundamental basis for developing RL methods. In this paper, we propose two PI methods, called differential PI (DPI) and integral PI (IPI), and their variants, for a general RL framework in continuous time and space (CTS)...
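
The evaluation/improvement recursion described in this abstract can be illustrated with a minimal sketch. Note that this is the classical tabular, discrete-time form of policy iteration, not the continuous-time DPI or IPI methods proposed in the paper. The function name `policy_iteration`, the array layout `P[a, s, s']` and `R[s, a]`, and the two-state toy problem are all assumptions made for illustration.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9):
    """Tabular policy iteration (illustrative sketch, discrete time).

    P[a, s, s2] -- probability of moving from s to s2 under action a (assumed layout)
    R[s, a]     -- expected immediate reward (assumed layout)
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)  # start with action 0 everywhere
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = P[policy, states]            # row s is P[policy[s], s, :]
        r_pi = R[states, policy]
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: act greedily on the one-step lookahead values.
        q = R + gamma * np.einsum("asn,n->sa", P, v)
        new_policy = np.argmax(q, axis=1)
        if np.array_equal(new_policy, policy):
            return policy, v                # policy is stable: stop
        policy = new_policy

# Hypothetical toy problem: action 0 stays put, action 1 switches state;
# only "stay in state 1" is rewarded.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 0.0],
              [1.0, 0.0]])
policy, v = policy_iteration(P, R, gamma=0.9)
print(policy, v)
```

Evaluation here is an exact linear solve, which is only practical for small state spaces; approximate or sample-based evaluation would replace that step in larger problems.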

Iterations of BRAF

Journal: Nature Reviews Cancer 2012

Iterations of Emptying

Journal: Journal of Humanistic Mathematics 2021

Iterations and fixpoints

Journal: Pacific Journal of Mathematics 1977

Conservative and Greedy Approaches to Classification-Based Policy Iteration

2012

Mohammad Ghavamzadeh, Alessandro Lazaric

The existing classification-based policy iteration (CBPI) algorithms can be divided into two categories: direct policy iteration (DPI) methods that directly assign the output of the classifier (the approximate greedy policy w.r.t. the current policy) to the next policy, and conservative policy iteration (CPI) methods in which the new policy is a mixture distribution of the current policy and th...
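
The difference between the two update rules named in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' algorithm: in CBPI the approximate greedy policy is the output of a trained classifier, whereas here a table of action-value estimates `q[s, a]` stands in for it. The helper names and the mixing weight `alpha` are hypothetical.

```python
import numpy as np

def greedy_policy(q):
    """One-hot greedy policy from action-value estimates q[s, a].

    Stand-in for the classifier output used in CBPI (assumption).
    """
    pi = np.zeros_like(q, dtype=float)
    pi[np.arange(q.shape[0]), np.argmax(q, axis=1)] = 1.0
    return pi

def dpi_update(pi, q):
    """Direct update: the next policy is the (approximate) greedy policy."""
    return greedy_policy(q)

def cpi_update(pi, q, alpha):
    """Conservative update: mix the current policy with the greedy one."""
    return (1.0 - alpha) * pi + alpha * greedy_policy(q)

# Hypothetical one-state, two-action example.
pi = np.array([[0.5, 0.5]])
q = np.array([[0.0, 1.0]])
print(dpi_update(pi, q))
print(cpi_update(pi, q, alpha=0.2))
```

The conservative rule moves only a fraction `alpha` of the probability mass toward the greedy action per step, so a poor greedy estimate changes the policy less than it would under the direct rule.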

Supplementary material of the CVPR’17 Viraliency: Pooling Local Virality

2017

Xavier Alameda-Pineda, Andrea Pilzer, Dan Xu, Nicu Sebe, Elisa Ricci, Bruno Kessler

We implemented our LENA pooling layer within the Caffe framework and ran all our experiments using a Tesla K40 GPU. All the networks were fine-tuned from the convolutional filters obtained when training these networks for the 1,000 image classification task on the ImageNet dataset. We iterated the stochastic gradient descent algorithm for 10,000 iterations with a momentum of μ = 0.9 and a weigh...
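
The momentum update mentioned in this excerpt can be sketched as follows. This is a generic heavy-ball update with L2 weight decay folded into the gradient, written as an assumption about the usual form of the rule rather than a reproduction of the authors' Caffe configuration. Only the momentum of 0.9 and the 10,000 iterations come from the text; the learning rate, the weight-decay default, the function name `sgd_momentum`, and the toy quadratic objective are hypothetical, and the gradient here is exact rather than stochastic.

```python
import numpy as np

def sgd_momentum(grad_fn, w0, lr, momentum=0.9, weight_decay=0.0, n_iters=10_000):
    """Gradient descent with momentum (illustrative sketch).

    grad_fn      -- returns the loss gradient at w (a mini-batch estimate in practice)
    weight_decay -- L2 coefficient; the source value is truncated, so the default
                    here is a placeholder, not the authors' setting
    """
    w = np.asarray(w0, dtype=float).copy()
    v = np.zeros_like(w)                  # velocity buffer
    for _ in range(n_iters):
        g = grad_fn(w) + weight_decay * w # weight decay added to the gradient
        v = momentum * v - lr * g         # accumulate velocity
        w = w + v                         # take the step
    return w

# Hypothetical toy objective: 0.5 * ||w - target||^2, whose gradient is w - target.
target = np.array([1.0, -2.0])
w = sgd_momentum(lambda w: w - target, w0=np.zeros(2), lr=0.01)
print(w)
```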

On asynchronous iterations

Journal: Journal of Computational and Applied Mathematics 2000
