Search results for: action value function

Number of results: 2342819

Journal: :IEEJ Transactions on Electronics, Information and Systems 2004

2003
PAWEL WAWRZYNSKI

With reference to methods analyzed recently by Sutton et al. and by Konda & Tsitsiklis, we propose a modification of them called Randomized Policy Optimizer (RPO). The algorithm has a modular structure and is based on the value function rather than on the action-value function. The modules include neural approximators and a parameterized distribution of control actions. The distribution must belong to a...

Journal: :International Journal of Agricultural Management and Development 2014
Nasrin Ohadi Jaleh Kurki Nejad

Given the remarkable strategic rank of pistachio in non-oil exports, the management of inputs in its production is very important. As the scarcest input in the agricultural sector, water is considered to be among the most important inputs of pistachio production. Inadequate water supply and climate conditions increase water demand in pistachio growing areas. It is necessary to determine the real value or pri...

Journal: :زبان شناسی و گویش های خراسان 0
حمیدرضا شعیری سعیده انتظاری ملکی

The main objective of the advertising discourse is to encourage its viewer or reader to buy goods. This dialogue contains several inductive functions, including inductive functions of action and tension. The function of the induced action is an action-induced origin, whereas the function of the inductive tension is induced by the action originated. Thus, tension-induced functions can be claimed ...

Journal: :Journal of Agricultural Science and Technology 2013
M. Azamzadeh Shouraki S. Khalilian S. A. Mortazavi

Production subsidies, as a part of the strategy of economic growth of the agricultural sector, are of great importance around the world. Subsidizing production inputs, particularly energy input, is another way of directing subsidy to the agricultural sector. In this research, the production function of the agricultural sector was estimated using econometric methods and time series data. After calcu...

Journal: :Adaptive Behaviour 1997
Juan Carlos Santamaría Richard S. Sutton Ashwin Ram

A key element in the solution of reinforcement learning problems is the value function. The purpose of this function is to measure the long-term utility or value of any given state. The function is important because an agent can use this measure to decide what to do next. A common problem in reinforcement learning when applied to systems having continuous states and action spaces is that the value...

1993
Ronald J. Williams Leemon C. Baird

Consider a given value function on states of a Markov decision problem, as might result from applying a reinforcement learning algorithm. Unless this value function equals the corresponding optimal value function, at some states there will be a discrepancy, which is natural to call the Bellman residual, between what the value function specifies at that state and what is obtained by a one-step lo...
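
The discrepancy described in this abstract can be illustrated for a small tabular problem. The following is only a minimal sketch, not the authors' formulation: it assumes the one-step quantity is the greedy (max over actions) backup, and the function name `bellman_residual` and the array layouts `P[a, s, s2]` and `R[a, s]` are hypothetical choices made for this example.

```python
import numpy as np

def bellman_residual(V, P, R, gamma):
    """Per-state gap between a greedy one-step backup and the current estimate.

    Assumed layout (hypothetical, for illustration only):
      V: shape (S,)      current state-value estimate
      P: shape (A, S, S) P[a, s, s2] = probability of moving s -> s2 under a
      R: shape (A, S)    expected immediate reward for taking a in s
    """
    V = np.asarray(V, dtype=float)
    # One-step backed-up value of every action in every state, shape (A, S).
    backed_up = np.asarray(R, dtype=float) + gamma * (np.asarray(P, dtype=float) @ V)
    # Greedy backup minus the current estimate, shape (S,).
    return backed_up.max(axis=0) - V

# One state, one action, reward 1, discount 0.5: the fixed point is V = 2.
P = np.array([[[1.0]]])
R = np.array([[1.0]])
residual_at_fixed_point = bellman_residual([2.0], P, R, gamma=0.5)
residual_at_zero = bellman_residual([0.0], P, R, gamma=0.5)
```

In this toy case the residual vanishes exactly when the estimate equals the fixed point of the backup and is nonzero otherwise.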

2009
Alejandro Agostini Enric Celaya

A Reinforcement Learning problem is formulated as trying to find the action policy that maximizes the accumulated reward received by the agent through time. One of the most popular algorithms used in RL is Q-Learning, which uses an action-value function q(s,a) to evaluate the expectation of the maximum future cumulative reward that will be obtained from executing action a in situation s. Q-Learni...
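
The role of the action-value function q(s,a) mentioned in this abstract can be sketched with the standard tabular Q-learning update. This is a generic textbook form and is not taken from the paper itself; the function name `q_learning_update`, the learning rate `alpha`, the discount `gamma`, and the `done` flag are assumptions introduced for illustration.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9, done=False):
    """Apply one tabular Q-learning step in place and return the table.

    Q is a float array of shape (num_states, num_actions); all parameter
    names are illustrative assumptions, not the paper's notation.
    """
    # Bootstrapped target: reward plus discounted best next-state value,
    # or just the reward when the episode ends at this transition.
    target = r if done else r + gamma * Q[s_next].max()
    # Move the current estimate a fraction alpha toward the target.
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

Q = np.zeros((2, 2))
q_learning_update(Q, s=0, a=1, r=1.0, s_next=1, alpha=0.5)
first_value = float(Q[0, 1])
```

Each call adjusts a single table entry, so repeated calls over sampled transitions gradually shape the estimate of q(s,a).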

2018

State-action value functions (i.e., Q-values) are ubiquitous in reinforcement learning (RL), giving rise to popular algorithms such as SARSA and Q-learning. We propose a new notion of action value defined by a Gaussian smoothed version of the expected Q-value. We show that such smoothed Q-values still satisfy a Bellman equation, making them learnable from experience sampled from an environment....

2017

State-action value functions (i.e., Q-values) are ubiquitous in reinforcement learning (RL), giving rise to popular algorithms such as SARSA and Q-learning. We propose a new notion of action value defined by a Gaussian smoothed version of the expected Q-value. We show that such smoothed Q-values still satisfy a Bellman equation, making them learnable from experience sampled from an environment....
