action value function

Search results for: action value function

Number of results: 2342819

The impact of obligation to assignment of ownership (tamlīk) in Imāmī jurisprudence and Iranian law

Journal: فقه و اصول

In ordinary cases, the object of the transaction in contracts such as sale (bay‘) and lease (ijāra) has financial value, like property (‘ayn), profit, or physical action. However, whether an obligation to perform an action has financial value and can be the object of a transaction is a matter of debate. In addition, in cases in which the subject of the obligation is the assignment of ownership (ta...

Cooperative behavior acquisition by asynchronous policy renewal that enables simultaneous learning in multiagent environment

2003

Shoichi Ikenoue Minoru Koh Hosoda

This paper presents a method for simultaneous learning in a multiagent environment so that cooperative behaviors emerge. Each agent has one policy and one action value function: the former is for action execution based on the action value function updated in the previous stage, and the latter is for learning based on the episodes experienced by the ε-greedy method. This makes all agents behave...
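As a rough illustration of how an action value function can drive action selection, here is a minimal sketch of the standard ε-greedy rule. It assumes that the exploration method named in the abstract is ordinary ε-greedy; the function name `epsilon_greedy` and the list-based representation of action values are hypothetical and are not taken from the paper.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index given one state's action values.

    q_values: list of estimated action values (hypothetical layout).
    epsilon:  probability of picking a uniformly random action.
    rng:      the random module or a random.Random instance.
    """
    if rng.random() < epsilon:
        # Explore: any action, chosen uniformly.
        return rng.randrange(len(q_values))
    # Exploit: the action with the highest estimated value
    # (ties resolve to the lowest index).
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon=0.0` the rule is purely greedy, and with `epsilon=1.0` it is purely random; values in between trade exploration against exploitation.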

Estimating economic values of land and family labor in producing rice

Journal: اقتصاد و توسعه کشاورزی

دشتی جوادی عارف عشقی

Abstract: Since about 34% of the area under rice cultivation in the whole country is in Gilan, this study examines some economic dimensions, especially the economic values of inputs in this region. Data were collected from 80 rice producers in 2007-2008. Considering the importance of the parametric approach in production structure and factor demand, applying seemingly unrelated regress...

Action-Gap Phenomenon in Reinforcement Learning

2011

Amir Massoud Farahmand

Many practitioners of reinforcement learning problems have observed that oftentimes the performance of the agent reaches very close to the optimal performance even though the estimated (action-)value function is still far from the optimal one. The goal of this paper is to explain and formalize this phenomenon by introducing the concept of the action-gap regularity. As a typical result, we prove...
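The observation in this abstract can be illustrated with a small numeric sketch. All numbers, state names, and helper names below are hypothetical and are not from the paper; the example only shows that an estimate can be far from the optimal action values while its greedy policy is still the optimal one, as long as the error stays below half the gap between the best and second-best action.

```python
def greedy_policy(q):
    """Map each state to the action with the highest value."""
    return {s: max(acts, key=acts.get) for s, acts in q.items()}


def action_gap(q, state):
    """Difference between the best and second-best action values in a state."""
    vals = sorted(q[state].values(), reverse=True)
    return vals[0] - vals[1]


# Hypothetical optimal action values and a deliberately poor estimate.
q_star = {"s0": {"left": 1.0, "right": 5.0}, "s1": {"left": 4.0, "right": 0.0}}
q_hat = {"s0": {"left": 2.5, "right": 3.5}, "s1": {"left": 2.5, "right": 1.5}}

# Largest absolute estimation error over all state-action pairs.
max_error = max(abs(q_star[s][a] - q_hat[s][a]) for s in q_star for a in q_star[s])
# Smallest action gap of the optimal action values.
min_gap = min(action_gap(q_star, s) for s in q_star)
# True when both value tables induce the same greedy policy.
same_policy = greedy_policy(q_star) == greedy_policy(q_hat)
```

Here every entry of `q_hat` is off by 1.5, yet the gap in `q_star` is 4.0 in both states, so the greedy action cannot flip.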

Generalized Rapid Action Value Estimation

2015

Tristan Cazenave

Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for many games, including the game of Go and General Game Playing (GGP). The standard algorithm for MCTS is Upper Confidence bounds applied to Trees (UCT). For games such as Go, a big improvement over UCT is the Rapid Action Value Estimation (RAVE) heuristic. We propose to generalize the RAVE heuristic so as to have more accurate es...
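For readers unfamiliar with UCT, the sketch below shows the usual UCB1-style child selection that it applies at each tree node: mean value plus an exploration bonus. It covers only that selection step, not RAVE or the generalization the abstract proposes. The function names, the `(total_value, visits)` tuple layout, and the default constant `c = sqrt(2)` are assumptions made for illustration.

```python
import math


def uct_score(total_value, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of one child: average value plus exploration bonus."""
    if visits == 0:
        # Unvisited children are tried first.
        return math.inf
    mean = total_value / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean + bonus


def select_child(children, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    children: list of (total_value, visits) pairs (hypothetical layout);
    the parent's visit count is taken as the sum of child visits.
    """
    parent_visits = sum(n for _, n in children)
    scores = [uct_score(w, n, parent_visits, c) for w, n in children]
    return scores.index(max(scores))
```

Setting `c=0.0` removes the bonus and reduces the rule to picking the child with the best average value among visited children.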

Striatal action-value neurons reconsidered

2017

Yonatan Loewenstein

It is generally believed that during economic decisions, striatal neurons represent the values associated with different actions. This hypothesis is based on a large number of electrophysiological studies, in which the neural activity of striatal neurons was measured while the subject was learning to prefer the more rewarding action. Here we present an alternative interpretation of...

Boosted Fitted Q-Iteration

2017

Samuele Tosatto Matteo Pirotta Carlo D'Eramo Marcello Restelli

This paper is about the study of B-FQI, an Approximated Value Iteration (AVI) algorithm that exploits a boosting procedure to estimate the action-value function in reinforcement learning problems. B-FQI is an iterative off-line algorithm that, given a dataset of transitions, builds an approximation of the optimal action-value function by summing the approximations of the Bellman residuals acros...
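The idea of building the action-value function as a sum of fitted Bellman residuals can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the weak learner here is a simple per-key average (`fit_mean_table`, a hypothetical helper), transitions are `(s, a, r, s_next, done)` tuples, and the action set is finite and shared across states.

```python
from collections import defaultdict


def fit_mean_table(inputs, targets):
    """Toy regressor (assumption): predict the mean target per (s, a) key."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, y in zip(inputs, targets):
        sums[key] += y
        counts[key] += 1
    table = {k: sums[k] / counts[k] for k in sums}
    return lambda key: table.get(key, 0.0)


def boosted_fqi(transitions, actions, gamma, n_iters):
    """Return q(s, a) built as a sum of models fitted to Bellman residuals."""
    models = []

    def q(s, a):
        # Current estimate: sum of all residual models fitted so far.
        return sum(h((s, a)) for h in models)

    for _ in range(n_iters):
        inputs, residuals = [], []
        for s, a, r, s_next, done in transitions:
            bootstrap = 0.0 if done else gamma * max(q(s_next, b) for b in actions)
            inputs.append((s, a))
            # Bellman residual of the current estimate on this sample.
            residuals.append(r + bootstrap - q(s, a))
        models.append(fit_mean_table(inputs, residuals))
    return q


# Hypothetical two-step chain: s0 -> s1 -> terminal, reward 1 on the last step.
data = [("s0", "go", 0.0, "s1", False), ("s1", "go", 1.0, "end", True)]
q_fn = boosted_fqi(data, actions=["go"], gamma=0.5, n_iters=3)
```

On this chain the residuals shrink to zero after two iterations, leaving `q_fn("s1", "go")` at 1.0 and `q_fn("s0", "go")` at 0.5.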

Multiagent Planning with Factored MDPs

2001

Carlos Guestrin Daphne Koller Ronald Parr

We present a principled and efficient planning algorithm for cooperative multiagent dynamic systems. A striking feature of our method is that the coordination and communication between the agents is not imposed, but derived directly from the system dynamics and function approximation architecture. We view the entire multiagent system as a single, large Markov decision process (MDP), which we as...

Mutation Operator Evolution for EA-Based Neural Networks

2005

Ryan J. Meuth

In the field of artificial intelligence, one of the hardest things that we can try to make a computer program do is to interact with the real world. In contrast to the well-defined, discrete, simplified world that programs normally operate in, the real world is large, unknown, and complex. In the real world, programs must learn and adapt to new and changing situations in order to be effective. ...

Uniform Value in Recursive Games

2000

Eilon Solan Nicolas Vieille

We address the problem of existence of the uniform value in recursive games. We give two existence results: (i) the uniform value is shown to exist if the state space is countable, the action sets are finite and if, for some a > 0, there are finitely many states in which the limsup value is less than a; (ii) for games with nonnegative payoff function, it is sufficient that the action set of pla...
