critic

A Supervised Goal Directed Algorithm in Economical Choice Behaviour: An Actor-Critic Approach

Journal: CoRR 2013

Keyvan Yahya

This paper aims to find an algorithmic structure that affords to predict and explain the economical choice behaviour particularly under uncertainty (random policies) by manipulating the prevalent Actor-Critic learning method to comply with the requirements we have been entrusted ever since the field of neuroeconomics dawned on us. Whilst skimming some basics of neuroeconomics that might be relev...


Policy Search in Reproducing Kernel Hilbert Space

2016

Ngo Anh Vien Peter Englert Marc Toussaint

Modeling policies in reproducing kernel Hilbert space (RKHS) renders policy gradient reinforcement learning algorithms non-parametric. As a result, the policies become very flexible and have a rich representational potential without a predefined set of features. However, their performances might be either non-covariant under reparameterization of the chosen kernel, or very sensitive to step-siz...


An Online Actor Critic Algorithm and a Statistical Decision Procedure for Personalizing Intervention

2016

Huitian Lei

By Huitian Lei. Chair: Professor Susan A. Murphy, Assistant Professor Ambuj Tewari. Increasing technological sophistication and widespread use of smartphones and wearable devices provide opportunities for innovative health interventions. An Adaptive Intervention (AI) personalizes the type, mode and...


Class-confidence critic combining

2002

Matti Aksela Ramunas Girdziusas Jorma Laaksonen Erkki Oja Jari Kangas

This paper discusses a combination of two techniques for improving the recognition accuracy of on-line handwritten character recognition: committee classification and adaptation to the user. A novel adaptive committee structure, namely the Class-Confidence Critic Combination (CCCC) scheme, is presented and evaluated. It is shown to be able to improve significantly on its member classifiers. Als...


A Critic for LISP

1987

Gerhard Fischer

Our goal is to establish the conceptual foundations for using the computational power that is or will be available on computer systems. Much of the available computing power is wasted, however, if users have difficulty understanding and using the full potential of these systems. Too much attention in the past has been given to the technology of computer systems and not enough to the effects of...


Quieten your inner critic.

Journal: CMAJ : Canadian Medical Association journal = journal de l'Association medicale canadienne 2009

Julie Strong


Demystifying MMD GANs

Journal: CoRR 2018

Mikolaj Binkowski Dougal J. Sutherland Michael Arbel Arthur Gretton

We investigate the training and performance of generative adversarial networks using the Maximum Mean Discrepancy (MMD) as critic, termed MMD GANs. As our main theoretical contribution, we clarify the situation with bias in GAN loss functions raised by recent work: we show that gradient estimators used in the optimization process for both MMD GANs and Wasserstein GANs are unbiased, but learning...


Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations

Journal: CoRR 2018

Xiaoqin Zhang Huimin Ma

Pretraining with expert demonstrations has been found useful in speeding up the training process of deep reinforcement learning algorithms since less online simulation data is required. Some people use supervised learning to speed up the process of feature learning, others pretrain the policies by imitating expert demonstrations. However, these methods are unstable and not suitable for actor-c...


Adaptive PID Controller based on Reinforcement Learning for Wind Turbine Control

2012

M. Sedighizadeh A. Rezazadeh

A self tuning PID control strategy using reinforcement learning is proposed in this paper to deal with the control of wind energy conversion systems (WECS). Actor-Critic learning is used to tune PID parameters in an adaptive way by taking advantage of the model-free and on-line learning properties of reinforcement learning effectively. In order to reduce the demand of storage space and to impro...


Actor-Critic Models of Reinforcement Learning in the Basal Ganglia: From Natural to Artificial Rats

Journal: Adaptive Behaviour 2005

Mehdi Khamassi Loïc Lachèze Benoît Girard Alain Berthoz Agnès Guillot

Since 1995, numerous Actor-Critic architectures for reinforcement learning have been proposed as models of dopamine-like reinforcement learning mechanisms in the rat’s basal ganglia. However, these models were usually tested in different tasks, and it is then difficult to compare their efficiency for an autonomous animat. We present here the comparison of four architectures in an animat as it p...
