action value function

Search results for: action value function

Number of results: 2342819

An Actor/Critic Algorithm that is Equivalent to Q-Learning

1994

Robert H. Crites, Andrew G. Barto

We prove the convergence of an actor/critic algorithm that is equivalent to Q-learning by construction. Its equivalence is achieved by encoding Q-values within the policy and value function of the actor and critic. The resultant actor/critic algorithm is novel in two ways: it updates the critic only when the most probable action is executed from any given state, and it rewards the actor using c...

The Physiological Action and Therapeutic Value of Sclerotic Acid

Journal: The American Journal of the Medical Sciences, 1880

ANTIPYRIN IN RHEUMATISM; ITS VALUE AND MODE OF ACTION.

Journal: Journal of the American Medical Association, 1887

Enkinaesthesia: Proto-moral value in action-enquiry and interaction

Journal: Phenomenology and the Cognitive Sciences, 2017

Student engagement with sustainability: understanding the value–action gap

Journal: International Journal of Sustainability in Higher Education, 2014

Context Transfer in Reinforcement Learning Using Action-Value Functions

Journal: Computational Intelligence and Neuroscience, 2014

Using Behavioral Economics to Reduce the Value-Action Gap

Journal: Ökologisches Wirtschaften - Fachzeitschrift, 2020

Boundary-Value Problems for Differential Systems with Pulsed Action

Journal: Journal of Mathematical Sciences, 2023

We establish necessary and sufficient coefficient conditions for the existence of solutions of weakly nonlinear boundary-value problems for systems of ordinary differential equations with pulsed action at fixed times, and we propose an iterative algorithm for the construction of these solutions.

XCSF with tile coding in discontinuous action-value landscapes

Journal: Evolutionary Intelligence, 2015

Error reducing sampling in reinforcement learning

2006

Bruno Scherrer

In reinforcement learning, an agent collects information by interacting with an environment and uses it to derive a behavior. This paper focuses on efficient sampling; that is, the problem of choosing the interaction samples so that the corresponding behavior tends quickly to the optimal behavior. Our main result is a sensitivity analysis relating the choice of sampling any state-action pair to the...
