Search results for: critic

Number of results: 2831

Journal: Neural Computation 2000
Kenji Doya

This article presents a reinforcement learning framework for continuous-time dynamical systems without a priori discretization of time, state, and action. Based on the Hamilton-Jacobi-Bellman (HJB) equation for infinite-horizon, discounted reward problems, we derive algorithms for estimating value functions and improving policies with the use of function approximators. The process of value func...

2015
Haibo He Zhen Ni Xiangnan Zhong

A Boundedness Theoretical Analysis for GrADP Design: A Case Study on Maze Navigation. A new theoretical analysis of the goal representation adaptive dynamic programming (GrADP) design proposed in [1], [2] is investigated in this paper. Unlike the proofs of convergence for adaptive dynamic programming (ADP) in the literature, here we provide a new insight for the error bound between ...

1998
Eiji Mizutani Stuart E. Dreyfus

In this paper we describe how an actor-critic reinforcement learning agent in a non-Markovian domain finds an optimal sequence of actions in a totally model-free fashion; that is, the agent neither learns transitional probabilities and associated rewards, nor by how much the state space should be augmented so that the Markov property holds. In particular, we employ an Elman-type recurrent neural ne...

Journal: عرفان معاصر 0
Naeem Amouri, Assistant Professor, Department of Arabic Language and Literature, Shahid Chamran University of Ahvaz

In Islamic civilization and in our era, many literary critics have become famous, but we hardly ever remember the writer of Alghadir, Alameh Abdulhussein Amini, as a great literary critic. In fact, he is skillful in his work and uses exact meaning in his literary works. Yet writers have forgotten him as a literary critic, or have indicated that they have forgotten him. In this essay, I would lik...

Journal: CoRR 2017
Audrunas Gruslys Mohammad Gheshlaghi Azar Marc G. Bellemare Rémi Munos

In this work we present a new reinforcement learning agent, called Reactor (for Retrace-actor), based on an off-policy multi-step return actor-critic architecture. The agent uses a deep recurrent neural network for function approximation. The network outputs a target policy π (the actor), an action-value Q-function (the critic) evaluating the current policy π, and an estimated behavioural policy...

Journal: The Journal of Neuroscience: the official journal of the Society for Neuroscience 2011
Ryan K Jessup John P O'Doherty

Reinforcement learning theory has generated substantial interest in neurobiology, particularly because of the resemblance between phasic dopamine and reward prediction errors. Actor-critic theories have been adapted to account for the functions of the striatum, with parts of the dorsal striatum equated to the actor. Here, we specifically test whether the human dorsal striatum--as predicted by a...

2006
Derong Liu

Adaptive critic control is an advanced control technology developed for nonlinear dynamical systems in recent years. It is based on the idea of approximate dynamic programming. Dynamic programming was introduced by Bellman in the 1950’s for solving optimal control problems of nonlinear dynamical systems. Due to its high computational complexity, applications of dynamic programming have been lim...

Journal: Environmental Health Perspectives 2009
