Search results for: action value function

Number of results: 2,342,819

Journal: CoRR, 2017
Vladimir Marochko, Leonard Johard, Manuel Mazzara

Catastrophic forgetting is of special importance in reinforcement learning, as the data distribution is generally non-stationary over time. We study and compare several pseudorehearsal approaches for Q-learning with function approximation in a pole-balancing task. We have found that pseudorehearsal seems to assist learning even in such very simple problems, given proper initialization of the reh...
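
A minimal sketch of the pseudorehearsal idea this abstract describes, assuming a small PyTorch Q-network on a CartPole-style state: random pseudo-states are passed through the current network, the resulting outputs are stored, and replaying these pairs alongside new transitions penalizes drift away from previously learned behavior. The names (QNet, make_pseudo_items, train_step), the network size, and the loss weighting are illustrative, not the paper's code.

```python
# Hedged sketch of pseudorehearsal for Q-learning with a neural approximator.
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small MLP mapping a state to one Q-value per discrete action."""
    def __init__(self, state_dim=4, n_actions=2, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions))

    def forward(self, s):
        return self.net(s)

def make_pseudo_items(q_net, n_items=64, state_dim=4):
    """Pseudorehearsal: sample random inputs and record the network's current
    outputs; replaying these pairs later discourages the network from
    forgetting its earlier input-output mapping."""
    with torch.no_grad():
        pseudo_states = torch.randn(n_items, state_dim)
        pseudo_targets = q_net(pseudo_states)
    return pseudo_states, pseudo_targets

def train_step(q_net, optimizer, batch, pseudo, gamma=0.99, rehearsal_weight=1.0):
    """One Q-learning update mixed with a rehearsal penalty on pseudo-items."""
    s, a, r, s_next, done = batch
    with torch.no_grad():
        # Standard one-step Q-learning target.
        target = r + gamma * (1 - done) * q_net(s_next).max(dim=1).values
    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    td_loss = nn.functional.mse_loss(q_sa, target)
    # Rehearsal loss: keep outputs on pseudo-items close to their stored values.
    pseudo_states, pseudo_targets = pseudo
    rehearsal_loss = nn.functional.mse_loss(q_net(pseudo_states), pseudo_targets)
    loss = td_loss + rehearsal_weight * rehearsal_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```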

1998
Jeff G. Schneider, Justin A. Boyan, Andrew W. Moore

Production scheduling, the problem of sequentially configuring a factory to meet forecasted demands, is a critical problem throughout the manufacturing industry. The requirement of maintaining product inventories in the face of unpredictable demand and stochastic factory output makes standard scheduling models, such as job-shop, inadequate. Currently applied algorithms, such as simulated anneali...

Journal: Journal of Machine Learning Research, 2003
Michail G. Lagoudakis, Ronald Parr

We propose a new approach to reinforcement learning for control problems which combines value-function approximation with linear architectures and approximate policy iteration. This new approach is motivated by the least-squares temporal-difference learning algorithm (LSTD) for prediction problems, which is known for its efficient use of sample experiences compared to pure temporal-difference a...
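
The least-squares scheme described here is commonly presented as LSTD-Q nested inside approximate policy iteration (the LSPI idea). A rough sketch under that reading, where the feature map phi, the ridge term, and the finite action set are illustrative assumptions rather than the paper's exact formulation:

```python
# Hedged sketch of LSTD-Q inside approximate policy iteration.
import numpy as np

def lstdq(samples, phi, policy, n_features, gamma=0.99, ridge=1e-6):
    """Solve the least-squares fixed point A w = b for the action-value
    function of `policy`, estimated from (s, a, r, s_next) samples."""
    A = ridge * np.eye(n_features)
    b = np.zeros(n_features)
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy(s_next))
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)

def lspi(samples, phi, actions, n_features, n_iters=20, gamma=0.99, tol=1e-6):
    """Alternate LSTD-Q policy evaluation with greedy improvement over a
    finite action set; reuses the same batch of samples every iteration."""
    w = np.zeros(n_features)
    greedy = lambda s: max(actions, key=lambda a: phi(s, a) @ w)  # uses current w
    for _ in range(n_iters):
        w_new = lstdq(samples, phi, greedy, n_features, gamma)
        converged = np.linalg.norm(w_new - w) < tol
        w = w_new
        if converged:
            break
    return w, greedy
```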

Journal: Japanese Sociological Review, 1962

2007
András Antos, Rémi Munos, Csaba Szepesvári

We consider continuous state, continuous action batch reinforcement learning where the goal is to learn a good policy from a sufficiently rich trajectory generated by some policy. We study a variant of fitted Q-iteration, where the greedy action selection is replaced by searching for a policy in a restricted set of candidate policies by maximizing the average action values. We provide a rigorou...
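
One way to read the variant described above in code: the pointwise greedy maximization of fitted Q-iteration is replaced by a search over a set of candidate policies for the one with the highest average action value on the batch of states. In the sketch below, the extra-trees regressor from scikit-learn and a finite candidate set are simplifying assumptions; the paper's restricted policy class and analysis are more general.

```python
# Hedged sketch of fitted Q-iteration with policy search over candidates.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def policy_search_fqi(samples, candidate_policies, n_iters=30, gamma=0.99):
    """samples: list of (s, a, r, s_next) with continuous states and actions.
    candidate_policies: finite list of callables mapping a state to an action."""
    states = np.array([s for s, _, _, _ in samples])
    sa = np.array([np.concatenate([s, np.atleast_1d(a)]) for s, a, _, _ in samples])
    rewards = np.array([r for _, _, r, _ in samples])
    next_states = np.array([sn for _, _, _, sn in samples])

    q = None                            # current fitted Q-function
    policy = candidate_policies[0]      # current policy
    for _ in range(n_iters):
        if q is None:
            targets = rewards
        else:
            # Evaluate next states under the currently selected policy.
            next_sa = np.array([np.concatenate([sn, np.atleast_1d(policy(sn))])
                                for sn in next_states])
            targets = rewards + gamma * q.predict(next_sa)
        q = ExtraTreesRegressor(n_estimators=50).fit(sa, targets)

        # Improvement step: pick the candidate maximizing the average action value.
        def avg_value(pi):
            pi_sa = np.array([np.concatenate([s, np.atleast_1d(pi(s))]) for s in states])
            return q.predict(pi_sa).mean()
        policy = max(candidate_policies, key=avg_value)
    return q, policy
```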

2010
Martha White, Adam White

We prove the consistency and coverage error for the bootstrapped studentized interval around the sample mean of the sequence of parameters for global function approximation. Each parameter vector θ_t on time step t corresponds to the action-value function Q_t on that time step, with Q(s, a) = f(θ, s, a) for some bounded function f. A common example of f is a linear function f(θ, s, a) = θφ(s, ...
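
The two objects named here can be made concrete: a linear action-value function Q(s, a) = θ·φ(s, a), and a bootstrapped studentized (bootstrap-t) interval around the sample mean of the parameter sequence θ_1, ..., θ_T. Plain i.i.d. resampling and per-coordinate intervals are simplifying assumptions in this sketch, not the paper's analysis.

```python
# Hedged sketch: linear action values and a bootstrap-t interval for the
# mean of a sequence of parameter vectors.
import numpy as np

def linear_q(theta, phi, s, a):
    """Q(s, a) = theta . phi(s, a) for a user-supplied feature map phi."""
    return theta @ phi(s, a)

def bootstrap_t_interval(thetas, alpha=0.05, n_boot=2000, seed=None):
    """Bootstrap-t interval for the mean of each coordinate of the parameter
    sequence thetas (array of shape T x d), using i.i.d. resampling."""
    rng = np.random.default_rng(seed)
    thetas = np.asarray(thetas)
    T = thetas.shape[0]
    mean = thetas.mean(axis=0)
    se = thetas.std(axis=0, ddof=1) / np.sqrt(T)
    t_stats = []
    for _ in range(n_boot):
        boot = thetas[rng.integers(0, T, size=T)]
        boot_se = boot.std(axis=0, ddof=1) / np.sqrt(T)
        # Studentized statistic of the resampled mean around the sample mean.
        t_stats.append((boot.mean(axis=0) - mean) / boot_se)
    t_stats = np.array(t_stats)
    t_hi = np.quantile(t_stats, 1 - alpha / 2, axis=0)
    t_lo = np.quantile(t_stats, alpha / 2, axis=0)
    return mean - t_hi * se, mean - t_lo * se   # (lower, upper) per coordinate
```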

Journal: IEEJ Transactions on Fundamentals and Materials, 1990

Journal: International Journal of Mathematical Modelling and Computations
Salah El Ouadih, Radouan Daher

Using a generalized spherical mean operator, we obtain a generalization of Titchmarsh's theorem for the Dunkl transform for functions satisfying the ('; p)-Dunkl Lipschitz condition in the space L^p(R^d; w_l(x) dx), 1 < p ≤ 2, where w_l is a weight function invariant under the action of an associated reflection group.
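
For orientation, the classical Fourier-transform version of Titchmarsh's theorem that this result generalizes is usually stated as below (a standard textbook form, quoted here for context; the paper replaces the Fourier transform and Lebesgue measure with the Dunkl transform and the weight w_l):

```latex
% Classical Titchmarsh theorem on the real line (Fourier case), for
% context only; assumes amsmath is loaded.
For $f \in L^2(\mathbb{R})$ and $0 < \alpha < 1$,
\[
  \|f(\cdot + h) - f\|_{L^2(\mathbb{R})} = O(h^{\alpha}) \quad (h \to 0^+)
  \quad\Longleftrightarrow\quad
  \int_{|\lambda| \ge r} \bigl|\widehat{f}(\lambda)\bigr|^2 \, d\lambda = O(r^{-2\alpha}) \quad (r \to \infty).
\]
```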

Chart: number of search results per year
