Search results for: action value function
Number of results: 2342819
Investigating the desertification trend requires understanding the phenomena that create changes, whether singly or through joint action and reaction, such that these changes end in land degradation. To investigate the pedological criterion of land degradation in Quaternary rock units, a part of the Rude-Shoor watershed area was first selected. After distinguishing the target area, maps of slope class...
Reinforcement learning agents attempt to learn and construct a decision policy which maximises some reward signal. In turn, this policy is directly derived from long-term value estimates of state-action pairs. In environments with real-valued state-spaces, however, it is impossible to enumerate the value of every state-action pair, necessitating the use of a function approximator in order to in...
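The idea of replacing a value table with a function approximator can be illustrated with a minimal sketch. It assumes a linear approximator over a hand-picked feature map and a Q-learning style update; the snippet does not say which approximator the authors use, so `features`, `q_value`, and `q_learning_update` are hypothetical names chosen for illustration only.

```python
def features(state, action, n_actions=2):
    """Hypothetical feature map: one [1, s, s^2] block per discrete action."""
    phi = [0.0] * (3 * n_actions)
    base = 3 * action
    phi[base:base + 3] = [1.0, state, state * state]
    return phi


def q_value(weights, state, action, n_actions=2):
    """Linear estimate Q(s, a) = w . phi(s, a) for a real-valued state s."""
    phi = features(state, action, n_actions)
    return sum(w * f for w, f in zip(weights, phi))


def q_learning_update(weights, s, a, r, s_next,
                      alpha=0.1, gamma=0.9, n_actions=2):
    """One semi-gradient Q-learning step; returns the updated weight list."""
    best_next = max(q_value(weights, s_next, b, n_actions)
                    for b in range(n_actions))
    td_error = r + gamma * best_next - q_value(weights, s, a, n_actions)
    phi = features(s, a, n_actions)
    return [w + alpha * td_error * f for w, f in zip(weights, phi)]


# Usage: start from zero weights and apply a single observed transition.
weights = [0.0] * 6
weights = q_learning_update(weights, s=0.5, a=1, r=1.0, s_next=0.7)
```

Because the weights are shared across all states, one update changes the estimate for every state with overlapping features, which is what makes generalisation over a real-valued state space possible.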
Convergence of a value-iteration-based algorithm for finding the optimal controller is proven for general nonlinear systems that are non-affine in the input. That is, it is shown that the algorithm converges to the optimal control and the optimal value function. It is assumed that at each iteration the value and action update equations can be solved exactly. Then two standard neural networks (NN) are u...
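The alternating value and action updates can be sketched on a much simpler problem. The example below is plain tabular value iteration on a tiny finite MDP, not the continuous, neural-network-based setting of the snippet; the transition table `P`, reward table `R`, and the function name `value_iteration` are assumptions made for illustration.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Tabular value iteration.

    P[s][a] is a list of next-state probabilities, R[s][a] the immediate reward.
    Returns the converged state values and a greedy policy.
    """
    n_states = len(P)
    V = [0.0] * n_states
    for _ in range(max_iters):
        # Value update: one-step lookahead for every state-action pair.
        Q = [[R[s][a] + gamma * sum(p * V[s2] for s2, p in enumerate(P[s][a]))
              for a in range(len(P[s]))]
             for s in range(n_states)]
        V_new = [max(q_row) for q_row in Q]
        converged = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if converged:
            break
    # Action update: act greedily with respect to the final lookahead values.
    policy = [max(range(len(Q[s])), key=lambda a: Q[s][a])
              for s in range(n_states)]
    return V, policy


# Two states, two actions: action 0 stays in place, action 1 switches state.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R = [[0.0, 1.0],
     [2.0, 0.0]]
V, policy = value_iteration(P, R, gamma=0.5)
```

With a discount factor below one, each sweep is a contraction, so the iterates approach the optimal values; here that means moving from state 0 to state 1 and then staying there.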
We target the problem of closed-loop learning of control policies that map visual percepts to continuous actions. Our algorithm, called Reinforcement Learning of Joint Classes (RLJC), adaptively discretizes the joint space of visual percepts and continuous actions. In a sequence of attempts to remove perceptual aliasing, it incrementally builds a decision tree that applies tests either in the i...
Scaling Reinforcement Learning (RL) to real-world problems with continuous state and action spaces remains a challenge. This is partly because the optimal value function can become quite complex in continuous domains. In this paper, we propose not to learn the optimal value function at all, but instead to use direct policy search methods in combination with model-based RL.
In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of coordination graphs of Guestrin, Koller, and Parr (2002a) which exploits the dependencies between agents to decompose the global payoff function into a sum of local terms. First, we deal with the single-state case and d...
The role that frontal-striatal circuits play in normal behavior remains unclear. Two of the leading hypotheses suggest that these circuits are important for action selection or reinforcement learning. To examine these hypotheses, we carried out an experiment in which monkeys had to select actions in two different task conditions. In the first (random) condition, actions were selected on the bas...
In this work, we study the performance of the sinc-collocation method for solving Bratu's problem. For different choices of step size, we tabulate the maximum absolute errors in the solutions at the sinc grid points. Comparison of the obtained results verifies that this method converges to the exact solution rapidly and with
This article is devoted to the study of the existence and multiplicity of positive solutions to a class of nonlinear fractional-order multi-point boundary value problems of the type $-D^{q}_{0+} u(t) = f(t, u(t))$, $1 < q \le 2$, $0 < t < 1$, $u(0) = 0$, $u(1) = \sum_{i=1}^{m-2} \delta_i u(\eta_i)$, where $D^{q}_{0+}$ represents the standard Riemann-Liouville fractional derivative, $\delta_i, \eta_i \in (0, 1)$ with $\sum_{i=1}^{m-2} \delta_i \eta_i^{q-1} < 1$, and $f : [0, 1] \times [0, \infty) \to [0, \infty$...
The third millennium has started, but the world is facing serious challenges in achieving international security and peace. Various human rights violations have led states to find means to protect human rights. Also, Article 55 of the United Nations Charter introduces respect for human rights and fundamental freedoms as the most suitable way to realize peace and security. ...