passive critic

A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems

Journal: Automatica 2013

Shubhendu Bhasin R. Kamalapurkar Marcus Johnson Kyriakos G. Vamvoudakis Frank L. Lewis Warren E. Dixon

An online adaptive reinforcement learning-based solution is developed for the infinite-horizon optimal control problem for continuous-time uncertain nonlinear systems. A novel actor–critic–identifier (ACI) is proposed to approximate the Hamilton–Jacobi–Bellman equation using three neural network (NN) structures—actor and critic NNs approximate the optimal control and the optimal value function,...

Towards Feature Selection In Actor-Critic Algorithms

2007

Khashayar Rohanimanesh Nicholas Roy Russ Tedrake

Choosing features for the critic in actor-critic algorithms with function approximation is known to be a challenge. Too few critic features can lead to degeneracy of the actor gradient, and too many features may lead to slower convergence of the learner. In this paper, we show that a well-studied class of actor policies satisfy the known requirements for convergence when the actor features are ...

Forward Actor-Critic for Nonlinear Function Approximation in Reinforcement Learning

2017

Vivek Veeriah Harm van Seijen Richard S. Sutton

Multi-step methods are important in reinforcement learning (RL). Eligibility traces, the usual way of handling them, work well with linear function approximators. Recently, van Seijen (2016) introduced a delayed learning approach, without eligibility traces, for handling the multi-step λ-return with nonlinear function approximators. However, this was limited to action-value methods. In thi...

Design and implementation of an adaptive critic-based neuro-fuzzy controller on an unmanned bicycle

Journal: CoRR 2017

Ali Shafiekhani Mohammad J. Mahjoob Mehdi Akraminia

Abstract: Fuzzy critic-based learning forms a reinforcement learning method based on dynamic programming. In this paper, an adaptive critic-based neuro-fuzzy system is presented for an unmanned bicycle. The only information available for the critic agent is the system feedback which is interpreted as the last action performed by the controller in the previous state. The signal produced by the c...

Findings of the Panel of Psychological Inquiry Convened at Saint Michael’s College, May 13, 2008: The Case of “Anna”

2011

Ronald B. Miller Marc Kessler Marion Bauer Sandra Howell Kenneth Kreiling

This paper briefly describes the proceedings of the Panel of Inquiry held May 13, 2008 at Saint Michael’s College on the case of “Anna” (Podetz, 2008, 2011). It summarizes the advocate's and critic's positions on four claims and one counter-claim. The five judges independently voted to accept all four of the advocate’s claims (by votes of 5-0 or 4-1), and rejected the critic's counterclaim by a...

Deep graph convolutional reinforcement learning for financial portfolio management – DeepPocket

Journal: Expert Systems With Applications 2021

Portfolio management aims at maximizing the return on investment while minimizing risk by continuously reallocating the assets forming the portfolio. These assets are not independent but correlated during a short time period. A graph convolutional reinforcement learning framework called DeepPocket is proposed whose objective is to exploit the time-varying interrelations between financial instruments. represented nod...

Online Learning of Optimal Control Solutions Using Integral Reinforcement Learning and Neural Networks

2011

Kyriakos G. Vamvoudakis Draguna Vrabie Frank L. Lewis

In this paper we introduce an online algorithm that uses integral reinforcement knowledge for learning the continuous-time optimal control solution for nonlinear systems with infinite horizon costs and partial knowledge of the system dynamics. This algorithm is a data based approach to the solution of the Hamilton-Jacobi-Bellman equation and it does not require explicit knowledge on the system’...

Model-free Adaptive Dynamic Programming for Optimal Control of Discrete-time Affine Nonlinear System

2014

Zhongpu Xia Dongbin Zhao Huajin Tang

In this paper, a model-free and effective approach is proposed to solve the infinite-horizon optimal control problem for affine nonlinear systems based on the adaptive dynamic programming technique. The developed approach, referred to as the actor-critic structure, employs two multilayer perceptron neural networks to approximate the state-action value function and the control policy, respectively. It u...

An RLS-Based Natural Actor-Critic Algorithm for Locomotion of a Two-Linked Robot Arm

2005

Jooyoung Park Jongho Kim Daesung Kang

Recently, actor-critic methods have drawn much interest in the area of reinforcement learning, and several algorithms have been studied along the line of the actor-critic strategy. This paper studies an actor-critic type algorithm utilizing the RLS (recursive least-squares) method, which is one of the most efficient techniques for adaptive signal processing, together with natural policy gradien...

Sustainable ℓ2-regularized actor-critic based on recursive least-squares temporal difference learning

2017

Luntong Li Dazi Li Tianheng Song

Least-squares temporal difference learning (LSTD) has been used mainly for improving the data efficiency of the critic in actor-critic (AC). However, convergence analysis of the resulting algorithms is difficult when the policy is changing. In this paper, a new AC method is proposed based on LSTD under the discount criterion. The method comprises two components as the contribution: (1) LSTD works in an ...
