Search results for: action value function

Number of results: 2342819

Journal: Fiqh va Usul (فقه و اصول)

In ordinary cases, the object of transaction in contracts such as sale (bay‘) and lease (ijāra) has financial value, like property (‘ayn), profit, or physical action. However, whether an obligation to perform an action has financial value and can be the object of transaction is a matter of debate. In addition, in cases in which the subject of the obligation is the assignment of ownership (ta...

2003
Shoichi Ikenoue Minoru Koh Hosoda

This paper presents a method for simultaneous learning in a multiagent environment so that cooperative behaviors emerge. Each agent has one policy and one action value function: the former is for action execution based on the action value function updated in the previous stage, and the latter is for learning based on the episodes experienced by the ε-greedy method. This makes all agents behave...

Journal: Agricultural Economics and Development (اقتصاد و توسعه کشاورزی)
دشتی جوادی عارف عشقی

Abstract: Since about 34% of the area under rice cultivation in the whole country is in Gilan, this study examines some economic dimensions, especially the economic values of inputs in this region. Data were collected from 80 rice producers in 2007-2008. Considering the importance of the parametric approach in production structure and factor demand, applying seemingly unrelated regress...

2011
Amir Massoud Farahmand

Many practitioners of reinforcement learning problems have observed that oftentimes the performance of the agent reaches very close to the optimal performance even though the estimated (action-)value function is still far from the optimal one. The goal of this paper is to explain and formalize this phenomenon by introducing the concept of the action-gap regularity. As a typical result, we prove...
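
The observation in this abstract can be illustrated with a small sketch. The two-state problem, its action names, and all numbers below are invented for illustration and are not taken from the paper; the point is only that an estimate with large errors can still produce the optimal greedy policy when those errors are smaller than half the gap between the best and second-best action values.

```python
def greedy_policy(q):
    """Return the greedy action for each state from a table q[state][action]."""
    return {s: max(actions, key=actions.get) for s, actions in q.items()}


# Hypothetical optimal action values (invented for illustration).
q_optimal = {
    "s0": {"left": 1.0, "right": 5.0},  # action gap 4.0
    "s1": {"left": 4.0, "right": 0.5},  # action gap 3.5
}

# A poor estimate: every entry is off by about 1.5, yet the ordering of
# actions in each state is preserved because the errors stay below half the gap.
q_estimate = {
    "s0": {"left": 2.5, "right": 3.5},
    "s1": {"left": 2.6, "right": 1.9},
}

max_error = max(
    abs(q_optimal[s][a] - q_estimate[s][a])
    for s in q_optimal
    for a in q_optimal[s]
)
same_policy = greedy_policy(q_optimal) == greedy_policy(q_estimate)

print("largest value error:", max_error)
print("greedy policies agree:", same_policy)
```

Here the value estimate is far from the optimal one, but the agent that acts greedily on it behaves optimally in both states.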

2015
Tristan Cazenave

Monte Carlo Tree Search (MCTS) is the state of the art algorithm for many games including the game of Go and General Game Playing (GGP). The standard algorithm for MCTS is Upper Confidence bounds applied to Trees (UCT). For games such as Go a big improvement over UCT is the Rapid Action Value Estimation (RAVE) heuristic. We propose to generalize the RAVE heuristic so as to have more accurate es...

2017
Yonatan Loewenstein

It is generally believed that during economic decisions, striatal neurons represent the values associated with different actions. This hypothesis is based on a large number of electrophysiological studies, in which the neural activity of striatal neurons was measured while the subject was learning to prefer the more rewarding action. Here we present an alternative interpretation of...

2017
Samuele Tosatto Matteo Pirotta Carlo D'Eramo Marcello Restelli

This paper is about the study of B-FQI, an Approximated Value Iteration (AVI) algorithm that exploits a boosting procedure to estimate the action-value function in reinforcement learning problems. B-FQI is an iterative off-line algorithm that, given a dataset of transitions, builds an approximation of the optimal action-value function by summing the approximations of the Bellman residuals acros...
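
The idea of building the action-value function as a sum of fitted Bellman residuals can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy dataset, the `done` flag, the discount factor, and the lookup-table regressor `fit_table` (standing in for whatever weak learner is actually boosted) are all assumptions. With an exact table fit, the procedure reduces to ordinary value iteration on the sampled transitions.

```python
from collections import defaultdict

GAMMA = 0.9                 # assumed discount factor
ACTIONS = ("stay", "go")    # hypothetical action set

# Hypothetical dataset of transitions: (state, action, reward, next_state, done).
DATASET = [
    ("a", "stay", 0.0, "a", False),
    ("a", "go", 0.0, "b", False),
    ("b", "stay", 0.0, "b", False),
    ("b", "go", 1.0, "end", True),
]


def fit_table(samples):
    """Stand-in regressor: average the targets seen for each (state, action)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, target in samples:
        sums[key] += target
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def boosted_fqi(dataset, iterations):
    """Return q(state, action) built as a sum of fitted Bellman residuals."""
    terms = []  # one fitted residual model per iteration

    def q(state, action):
        return sum(term.get((state, action), 0.0) for term in terms)

    for _ in range(iterations):
        samples = []
        for s, a, r, s_next, done in dataset:
            bootstrap = 0.0 if done else GAMMA * max(q(s_next, b) for b in ACTIONS)
            # Bellman residual of the current sum at this sample.
            samples.append(((s, a), r + bootstrap - q(s, a)))
        terms.append(fit_table(samples))  # add the new term to the sum
    return q


q = boosted_fqi(DATASET, iterations=50)
for state in ("a", "b"):
    for action in ACTIONS:
        print(state, action, round(q(state, action), 4))
```

Each iteration fits only the residual of the current approximation, so later terms shrink toward zero as the sum approaches a fixed point of the Bellman operator on this dataset.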

2001
Carlos Guestrin Daphne Koller Ronald Parr

We present a principled and efficient planning algorithm for cooperative multiagent dynamic systems. A striking feature of our method is that the coordination and communication between the agents is not imposed, but derived directly from the system dynamics and function approximation architecture. We view the entire multiagent system as a single, large Markov decision process (MDP), which we as...

2005
Ryan J. Meuth

In the field of artificial intelligence, one of the hardest things that we can try to make a computer program do is to interact with the real world. In contrast to the well-defined, discrete, simplified world that programs normally operate in, the real world is large, unknown, and complex. In the real world, programs must learn and adapt to new and changing situations in order to be effective. ...

2000
EILON SOLAN NICOLAS VIEILLE

We address the problem of existence of the uniform value in recursive games. We give two existence results: (i) the uniform value is shown to exist if the state space is countable, the action sets are finite and if, for some a > 0, there are finitely many states in which the limsup value is less than a; (ii) for games with nonnegative payoff function, it is sufficient that the action set of pla...
