Search results for: critic

Number of results: 2831

1997
Danil V. Prokhorov Lee A. Feldkamp

We propose a simple framework for critic-based training of recurrent neural networks and feedback controllers. We term the critics that are used primitive adaptive critics, since we represent them with the simplest possible architecture (bias weight only). We derive this framework from two main premises. The first of these is a natural similarity between a form of approximate dynamic programmin...

2016
Manish Sharma Ajay Verma

This paper is concerned with the observer design problem for a class of uncertain delayed nonlinear systems using reinforcement learning. Reinforcement learning is used via two wavelet neural networks (WNNs), a critic WNN and an action WNN, which are combined to form an adaptive WNN controller. The “strategic” utility function is approximated by the critic WNN and is minimized by the action WNN. A...

2014
Guido Bacciagaluppi

This chapter comments on that by Chris Fuchs on qBism. It presents some mild criticisms of this view, some based on the EPR and Wigner’s friend scenarios, and some based on the quantum theory of measurement. A few alternative suggestions for implementing a subjectivist interpretation of probability in quantum mechanics conclude the chapter. “M. Braque est un jeune homme fort audacieux. [...] Il...

Journal: Research and Practice in Technology Enhanced Learning 2010
Yeonjoo Oh Mark D. Gross Suguru Ishizaki Ellen Yi-Luen Do

This paper reports on the Furniture Design Critic. We propose a computational model of design critiquing using the program, which as a research tool helps us explain how to select critiquing methods in the consideration of critiquing conditions. Surveying the literature of architectural education, we have identified two dimensions from critiquing comments: (1) delivery types (interpretation, in...

2001
Konstantinos C. Zikidis Spyros G. Tzafestas

Function approximation has been used extensively with reinforcement learning, even though theoretical support was based mainly on tabular representations. This paper proposes an actor-critic structure following the existing convergence proofs as much as possible. The actor and critic modules employ an adaptive neuro-fuzzy architecture based on fuzzy ARTMAP concepts and gradient descent. Resul...

2007
Barry G. Silverman Christo Andonyadis Yair Rajwan Alfredo Morales

This paper reviews efforts to design an environment for authoring group performance support system (GPSS) agents that can interact with remote internet resources, applications, and users. We review the architecture of the GPSS environment and show how its tutor/critic elements do and don’t map into an idealized specification extracted from the agent oriented programming and inter-agent communic...

Journal: CoRR 2017
Yemi Okesanjo Victor Kofia

Off-policy stochastic actor-critic methods rely on approximating the stochastic policy gradient in order to derive an optimal policy. One may also derive the optimal policy by approximating the action-value gradient. The use of action-value gradients is desirable as policy improvement occurs along the direction of steepest ascent. This has been studied extensively within the context of natural ...

2010
Vladimir G. Red'ko Danil V. Prokhorov

We study a model of evolving populations of self-learning agents and analyze the interaction between learning and evolution. We consider agent-brokers that predict stock price changes and use these predictions for selecting actions. Each agent is equipped with a neural network adaptive critic design for behavioral adaptation. We discuss three cases in which either learning, or evolution, or bot...
