Search results for: literary critic

Number of results: 18655

2000
Stephen Shervais Thaddeus T. Shannon

We demonstrate the possibility of optimal control of physical inventory systems in a nonstationary fitness terrain, based on the combined application of evolutionary search and adaptive critic terrain following. We show that adaptive critic based approximate dynamic programming techniques based on plant-controller Jacobians can be used with systems characterized by discrete valued states and co...

2013
Merinda Simmons Atalia Omer Jonathan Z. Smith

This essay rejoins Merinda Simmons’s protection of Russell McCutcheon’s critic vs. caretaker dichotomy in her response to my “Can a Critic be a Caretaker Too? Religion, Conflict and Conflict Transformation” (JAAR 2011). While Simmons aims to preserve McCutcheon’s binary as a purportedly benignly unavoidable opposition, I expose the perils of epistemic anti-realism at the heart of that dichotomy...

Journal: Canadian Journal of Communication 1984

2018
Piji Li Lidong Bing Wai Lam

We present a training framework for neural abstractive summarization based on actor-critic approaches from reinforcement learning. In the traditional neural network based methods, the objective is only to maximize the likelihood of the predicted summaries, no other assessment constraints are considered, which may generate low-quality summaries or even incorrect sentences. To alleviate this prob...

Journal: CoRR 2017
Yuhuai Wu Elman Mansimov Shun Liao Roger B. Grosse Jimmy Ba

In this work, we propose to apply trust region optimization to deep reinforcement learning using a recently proposed Kronecker-factored approximation to the curvature. We extend the framework of natural policy gradient and propose to optimize both the actor and the critic using Kronecker-factored approximate curvature (K-FAC) with trust region; hence we call our method Actor Critic using Kronec...

2017
Pierre-Luc Bacon Jean Harb Doina Precup

Temporal abstraction is key to scaling up learning and planning in reinforcement learning. While planning with temporally extended actions is well understood, creating such abstractions autonomously from data has remained challenging. We tackle this problem in the framework of options [Sutton, Precup & Singh, 1999; Precup, 2000]. We derive policy gradient theorems for options and propose a new ...

2010
William Dabney Andrew G. Barto

In this paper, we address the critic optimization problem within the context of reinforcement learning. The focus of this problem is on improving an agent’s critic, so as to increase performance over a distribution of tasks. We use ordered derivatives, in a process similar to back propagation through time, to compute the gradient of an agent’s fitness with respect to its reward function. With e...

Journal: Automatica 2009
Shalabh Bhatnagar Richard S. Sutton Mohammad Ghavamzadeh Mark Lee

We present four new reinforcement learning algorithms based on actor–critic, natural-gradient and function-approximation ideas, and we provide their convergence proofs. Actor–critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochasti...

2015
Mark Algee-Hewitt Ryan Heuser Franco Moretti
