Search results for: action value function
Number of results: 2342819
Chapter one is devoted to a moderate discussion of preliminaries, according to our requirements. Chapter two, which is based on our work in (24), is devoted to introducing weighted semigroups (s, w) and studying some famous function spaces on them; in particular, the relations between go (s, w) and other function spaces are investigated. In fact, this chapter is a complement to (32). One of the main fea...
Value-based reinforcement learning typically involves the repeated application of an update rule, such as the Bellman operator $T_B$, to an action-value function. Recent work has explored the use of alternative operators, which remain optimality-preserving and may result in improved performance. In this report, I study in particular the advantage learning operator, $T_{AL} Q = T_B Q - \alpha (V - Q)$. A theoret...
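The operator quoted in the snippet above can be illustrated on a small tabular problem. The following is only a minimal sketch, not the report's implementation: it assumes V(s) = max_a Q(s, a) (the snippet does not define V), takes the Bellman operator to be the usual optimality backup, and uses a made-up two-state, two-action MDP. The names `bellman_operator`, `advantage_learning_operator`, `P`, `R` and `Q` are hypothetical.

```python
def bellman_operator(Q, P, R, gamma):
    """Bellman optimality backup for a tabular Q.

    Q[s][a] is the current action value, P[s][a] is a list of next-state
    probabilities, and R[s][a] is the expected immediate reward.
    """
    # Assumption: V(s) = max_a Q(s, a); the snippet does not define V.
    V = [max(row) for row in Q]
    return [
        [R[s][a] + gamma * sum(p * V[s2] for s2, p in enumerate(P[s][a]))
         for a in range(len(Q[s]))]
        for s in range(len(Q))
    ]


def advantage_learning_operator(Q, P, R, gamma, alpha):
    """Apply T_AL Q = T_B Q - alpha * (V - Q) elementwise."""
    TQ = bellman_operator(Q, P, R, gamma)
    return [
        [TQ[s][a] - alpha * (max(Q[s]) - Q[s][a]) for a in range(len(Q[s]))]
        for s in range(len(Q))
    ]


# Hypothetical toy MDP: action 0 stays in the current state, action 1 switches.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R = [[0.0, 1.0],
     [2.0, 0.0]]
Q = [[1.0, 3.0],
     [2.0, 0.5]]

TQ = bellman_operator(Q, P, R, gamma=0.9)
AL = advantage_learning_operator(Q, P, R, gamma=0.9, alpha=0.5)
print("T_B Q :", TQ)
print("T_AL Q:", AL)
```

Under these assumptions the correction term vanishes for the greedy action in each state, so only the non-greedy entries are lowered, and setting `alpha=0` recovers the plain Bellman backup.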
Let $g$ be a molecular graph with vertex set $v(g)$, and let $d_g(u, v)$ be the topological distance between vertices $u$ and $v$ in $g$. The Hosoya polynomial $h(g, x)$ of $g$ is the polynomial $\sum\limits_{\{u, v\}\subseteq v(g)} x^{d_g(u, v)}$ in the variable $x$. In this paper, we obtain an explicit analytical expression for the expected value of the Hosoya polynomial of a random benzenoid chain with $n$ hexagon...
Translation as a communicative process is always said to be associated with various aspects of meaning loss or gain. Subtitling as a mode of translating, due to the special discoursal and textual conditions imposed upon it, is believed to be an obvious case of this loss or gain. Presenting the spoken sound track of a film in writing and synchronizing the perception of this text by the viewers with...
This work describes a novel algorithm that integrates an adaptive resonance method (ARM), i.e. an ART-based algorithm with a self-organized design, and a Q-learning algorithm. By dynamically adjusting the size of sensitivity regions of each neuron and adaptively eliminating one of the redundant neurons, ARM can preserve resources, i.e. available neurons, to accommodate additional categories. As...
In this paper, He's highly prolific variational iteration method is applied effectively for showing the existence and uniqueness of, and solving, a class of singular second-order two-point boundary value problems. The process of finding the solution involves generating a sequence of appropriate and approximate iterative solution functions equally likely to converge to the exact solution of the given pr...
Objective(s): The present study was conducted to examine whether the standardized uptake value (SUV) may be affected by the spatial position of a lesion in the radial direction on positron emission tomography (PET) images, obtained via two methods based on time-of-flight (TOF) reconstruction and point spread function (PSF). Methods: A cylinder phantom with the sphere (30 mm diameter), located in...
Enterprise resource planning (ERP) software is needed by industries and companies that want to develop in the future. Many manufacturers and companies have a problem with ERP software selection. An inappropriate selection process can significantly affect both the implementation and the performance of the company. Although several models have been proposed to solve this problem, many of them did n...
We examine the stability of rational expectations equilibria in the class of models in which the decision of the individual agent is discontinuous with respect to the state variables. Instead of rational expectations, each agent learns the unknown parameters through a recursive stochastic algorithm. If the agents update the estimated value function "rapidly" enough, then each agent learns the true v...