Search results for: critic

Number of results: 2831

2008
Francisco S. Melo Manuel Lopes

In this paper we address reinforcement learning problems with continuous state-action spaces. We propose a new algorithm, fitted natural actor-critic (FNAC), that extends the work in [1] to allow for general function approximation and data reuse. We combine the natural actor-critic architecture [1] with a variant of fitted value iteration using importance sampling. The method thus obtained combines...

Journal: :Neural networks : the official journal of the International Neural Network Society 2012
Feng Liu Jian Sun Jennie Si Wentao Guo Shengwei Mei

Approximate/adaptive dynamic programming (ADP) has been studied extensively in recent years for its potential scalability to solve large state and control space problems, including those involving continuous states and continuous controls. The applicability of ADP algorithms, especially the adaptive critic designs, has been demonstrated in several case studies. Direct heuristic dynamic programmi...

2010
Angustae Vitae

The study of intertextuality, the shaping of a text’s meaning by other texts, remains a laborious process for the literary critic. Kristeva (Kristeva, 1986) suggests that "Any text is constructed as a mosaic of quotations; any text is the absorption and transformation of another." The nature of these mosaics is widely varied, from direct quotations representing a simple and overt intertextualit...

2003
Stephen Shervais Thaddeus T. Shannon George G. Lendaris

This work was supported in part by the National Science Foundation under grant ECS-9904378. Abstract: Adaptive critic based approximate dynamic programming techniques are gradient based methods for finding optimal policies for multi-stage decision processes. We believe adaptive critic methods are now developed to the point that they can be applied to the full spectrum of decision and control problem...

Journal: :Neurocomputing 2012
Haibo He Zhen Ni Jian Fu

In this paper, we propose a novel adaptive dynamic programming (ADP) architecture with three networks, an action network, a critic network, and a reference network, to develop internal goal representation for online learning and optimization. Unlike the traditional ADP design normally with an action network and a critic network, our approach integrates the third network, a reference network, int...

Journal: :Soft Comput. 2013
Dongbin Zhao Bin Wang Derong Liu

A novel supervised Actor–Critic (SAC) approach for the adaptive cruise control (ACC) problem is proposed in this paper. The key elements required by the SAC algorithm, namely the Actor and the Critic, are each approximated by feed-forward neural networks. The output of the Actor and the state are input to the Critic to approximate the performance index function. A Lyapunov stability analysis approach has be...

Journal: :CoRR 2017
Zhewei Huang Shuchang Zhou BoEr Zhuang Xinyu Zhou

We introduce an Actor-Critic Ensemble (ACE) method for improving the performance of the Deep Deterministic Policy Gradient (DDPG) algorithm. At inference time, our method uses a critic ensemble to select the best action from proposals of multiple actors running in parallel. By having a larger candidate set, our method can avoid actions that have fatal consequences, while staying deterministic. Using...
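
The inference-time selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the critics' estimates are combined, so averaging them is an assumption made here, and `select_action`, the toy `actors`, and the toy `critics` are hypothetical stand-ins for trained networks.

```python
from statistics import mean
from typing import Callable, List, Sequence

State = Sequence[float]
Action = List[float]
Actor = Callable[[State], Action]
Critic = Callable[[State, Action], float]


def select_action(state: State, actors: Sequence[Actor],
                  critics: Sequence[Critic]) -> Action:
    """Return the actor proposal with the highest ensemble score."""
    # Each actor proposes one candidate action for the current state.
    proposals = [actor(state) for actor in actors]
    # Assumption: the ensemble score is the mean of the critics' estimates.
    scores = [mean(critic(state, action) for critic in critics)
              for action in proposals]
    # Deterministic choice: index of the highest-scoring proposal.
    best = max(range(len(proposals)), key=scores.__getitem__)
    return proposals[best]


# Toy hand-written actors and critics (hypothetical, not trained networks).
actors = [lambda s: [0.0], lambda s: [0.5], lambda s: [1.0]]
critics = [
    lambda s, a: -(a[0] - 0.4) ** 2,  # this critic prefers actions near 0.4
    lambda s, a: -(a[0] - 0.6) ** 2,  # this critic prefers actions near 0.6
]

chosen = select_action([0.0], actors, critics)
```

With these toy functions the middle proposal has the highest average score, so it is the one returned. Another aggregation rule, such as taking the minimum over critics, could be swapped in on the line that computes `scores`.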

Journal: :CoRR 2017
Ivo Danihelka Balaji Lakshminarayanan Benigno Uria Daan Wierstra Peter Dayan

We train a generator by maximum likelihood and we also train the same generator architecture by Wasserstein GAN. We then compare the generated samples, exact log-probability densities and approximate Wasserstein distances. We show that an independent critic trained to approximate Wasserstein distance between the validation set and the generator distribution helps detect overfitting. Finally, we...

Journal: :Bagh-e Nazar (scientific research quarterly) 2011
Seyed Abdolhadi Daneshpour Iman Raeisi

This study was carried out using a comparative method. First, the meaning and definition of the word critic were extracted from four well-known English dictionaries (Webster, Oxford, Longman, and American Heritage), and, by comparing the meanings with one another, the synonyms used in each dictionary were obtained. Then, based on the frequency of each word, five words (analyse, judge, evaluate, appraise, assess) were selected from among them and ...

2014
Abhijit Gosavi

Actor-critic algorithms are amongst the most well-studied reinforcement learning algorithms that can be used to solve Markov decision processes (MDPs) via simulation. Unfortunately, the parameters of the so-called “actor” in the classical actor-critic algorithm exhibit great volatility — getting unbounded in practice, whence they have to be artificially constrained to obtain solutions in practi...
