Search results for: passive critic

Number of results: 73280

Journal: دراسات الادب المعاصر (Studies in Contemporary Literature)
Ali Sayadani, Assistant Professor, Department of Arabic Language and Literature, Azarbaijan Shahid Madani University, Iran. Mohammad Saleh Sharif Askari, Associate Professor, Department of Arabic Language and Literature, Kharazmi University, Iran. Mehdi Shafaei, PhD student, Department of Arabic Language and Literature, Kharazmi University, Iran.

Most studies do not differentiate between intuitive criticism and passive criticism in terms of features, applications, and methods. This critical school, called emotional or passive criticism, emphasizes aesthetics. Thus sensitivity is not only an instrument of wisdom but also a means of inspiring feelings of happiness or unhappiness. Consequently, the passive (critical) school is the result ...

2008
Jia Ma Tao Yang Zeng-Guang Hou Min Tan

Vibration isolation control is the critical issue to guarantee the performance of various vibration-sensitive instruments and sensors in practical engineering systems. In this paper, single network adaptive critic (SNAC) based controllers are developed for vibration isolation applications. The SNAC approach differs from the typical action-critic dual network structure in adaptive critic designs...

1999
F. L. Lewis

Two feedback control systems are designed that employ the adaptive critic architecture, which consists of two neural networks, one of which (the critic) tunes the other. The first application is a deadzone compensator, where it is shown that the adaptive critic structure is a natural consequence of the mathematical problem of inversion of an unknown function. In this situation the adaptive crit...

2011
Victor Gabillon Alessandro Lazaric Mohammad Ghavamzadeh Bruno Scherrer

In this paper, we study the effect of adding a value function approximation component (critic) to rollout classification-based policy iteration (RCPI) algorithms. The idea is to use a critic to approximate the return after we truncate the rollout trajectories. This allows us to control the bias and variance of the rollout estimates of the action-value function. Therefore, the introduction of a ...

Journal: SIAM J. Control and Optimization 2003
Vijay R. Konda John N. Tsitsiklis

In this article, we propose and analyze a class of actor-critic algorithms. These are two-time-scale algorithms in which the critic uses temporal difference learning with a linearly parameterized approximation architecture, and the actor is updated in an approximate gradient direction, based on information provided by the critic. We show that the features for the critic should ideally span a su...

2002
Xiaoqun Liao

Intelligent industrial and mobile robots may be considered proven technology in structured environments. Teach programming and supervised learning methods permit solutions to a variety of applications. However, we believe that extending the operation of these machines to more unstructured environments requires a new learning method. Both unsupervised learning and reinforcement learning are pote...

Journal: CoRR 2017
Bo Dai Albert Shaw Niao He Lihong Li Le Song

This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic or Dual-AC. It is derived in a principled way from the Lagrangian dual form of the Bellman optimality equation, which can be viewed as a two-player game between the actor and a critic-like function, named the dual critic. Compared to its actor-critic relatives, Dual-AC has the desired property that the actor...

1996
Andrew Ireland Alan Bundy

In earlier papers a critic for automatically generalizing conjectures in the context of failed inductive proofs was presented. The critic exploits the partial success of the search control heuristic known as rippling. Through empirical testing a natural generalization and extension of the basic critic emerged. Here we describe our extended generalization critic together with some promising expe...

1994
Robert H. Crites Andrew G. Barto

We prove the convergence of an actor/critic algorithm that is equivalent to Q-learning by construction. Its equivalence is achieved by encoding Q-values within the policy and value function of the actor and critic. The resultant actor/critic algorithm is novel in two ways: it updates the critic only when the most probable action is executed from any given state, and it rewards the actor using c...
