Search results for: action value function

Number of results: 2342819

1994
Robert H. Crites Andrew G. Barto

We prove the convergence of an actor/critic algorithm that is equivalent to Q-learning by construction. Its equivalence is achieved by encoding Q-values within the policy and value function of the actor and critic. The resultant actor/critic algorithm is novel in two ways: it updates the critic only when the most probable action is executed from any given state, and it rewards the actor using c...

Journal: The American Journal of the Medical Sciences 1880

Journal: Journal of the American Medical Association 1887

Journal: Phenomenology and the Cognitive Sciences 2017

Journal: International Journal of Sustainability in Higher Education 2014

Journal: Computational Intelligence and Neuroscience 2014

Journal: Ökologisches Wirtschaften - Fachzeitschrift 2020

Journal: Journal of Mathematical Sciences 2023

We establish coefficient necessary and sufficient conditions for the existence of solutions of weakly nonlinear boundary-value problems for systems of ordinary differential equations with pulsed action at fixed times, and propose an iterative algorithm for the construction of these solutions.

2006
Bruno Scherrer

In reinforcement learning, an agent collects information interacting with an environment and uses it to derive a behavior. This paper focuses on efficient sampling; that is, the problem of choosing the interaction samples so that the corresponding behavior tends quickly to the optimal behavior. Our main result is a sensitivity analysis relating the choice of sampling any state-action pair to the...
