Search results for: action value function

Number of results: 2342819

Thesis: Ministry of Science, Research and Technology - Ferdowsi University of Mashhad - Faculty of Sciences 1377

Chapter one is devoted to a moderate discussion of preliminaries, according to our requirements. Chapter two, which is based on our work in (24), is devoted to introducing weighted semigroups (s, w) and studying some famous function spaces on them; in particular, the relations between go (s, w) and other function spaces are investigated. In fact, this chapter is a complement to (32). One of the main fea...

2016
Greg Farquhar

Value-based reinforcement learning typically involves the repeated application of an update rule, such as the Bellman operator $T_B$, to an action-value function. Recent work has explored the use of alternative operators, which remain optimality-preserving and may result in improved performance. In this report, I study in particular the advantage learning operator, $T_{AL} Q = T_B Q - \alpha (V - Q)$. A theoret...
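The operator named in this abstract can be illustrated with a minimal tabular sketch. It assumes a small finite MDP stored as nested lists, with $V(s) = \max_a Q(s, a)$; the function names, the data layout, and the two-state toy MDP are hypothetical choices for illustration and are not taken from the report.

```python
def bellman_operator(Q, P, R, gamma):
    """Apply the Bellman optimality operator T_B to a tabular Q.

    Assumed layout (illustrative only): Q[s][a] is the action value,
    P[s][a][s2] the transition probability, R[s][a] the expected reward.
    """
    n_states = len(Q)
    V = [max(row) for row in Q]  # V(s) = max_a Q(s, a)
    return [
        [
            R[s][a] + gamma * sum(P[s][a][s2] * V[s2] for s2 in range(n_states))
            for a in range(len(Q[s]))
        ]
        for s in range(n_states)
    ]


def advantage_learning_operator(Q, P, R, gamma, alpha):
    """Apply T_AL Q = T_B Q - alpha * (V - Q), entry by entry."""
    TQ = bellman_operator(Q, P, R, gamma)
    V = [max(row) for row in Q]
    return [
        [TQ[s][a] - alpha * (V[s] - Q[s][a]) for a in range(len(Q[s]))]
        for s in range(len(Q))
    ]


# Hypothetical two-state, two-action MDP: action a moves deterministically to state a.
P = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
R = [[0.0, 1.0], [0.0, 2.0]]
Q = [[0.0, 1.0], [0.5, 0.0]]

TB_Q = bellman_operator(Q, P, R, gamma=0.9)
TAL_Q = advantage_learning_operator(Q, P, R, gamma=0.9, alpha=0.5)
```

In this sketch the correction term vanishes for the greedy action in each state, so only the values of non-greedy actions are lowered, which widens the gap between the best action and the rest.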

Journal: Iranian Journal of Mathematical Chemistry 2016
S.-J. Xu, Q.-H. He, S. Zhou, W. H. Chan

Let $G$ be a molecular graph with vertex set $V(G)$, and let $d_G(u, v)$ be the topological distance between vertices $u$ and $v$ in $G$. The Hosoya polynomial $H(G, x)$ of $G$ is the polynomial $\sum\limits_{\{u, v\}\subseteq V(G)} x^{d_G(u, v)}$ in the variable $x$. In this paper, we obtain an explicit analytical expression for the expected value of the Hosoya polynomial of a random benzenoid chain with $n$ hexagon...
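The definition in this abstract can be made concrete with a short sketch that computes the coefficients of the Hosoya polynomial by breadth-first search. It assumes an unweighted, connected graph given as an adjacency dictionary and counts only pairs of distinct vertices; the function names and the single-hexagon example are illustrative assumptions, not the paper's method for random benzenoid chains.

```python
from collections import deque


def hosoya_coefficients(adj):
    """Return coeffs where coeffs[k] is the number of unordered vertex pairs at distance k.

    adj maps each vertex to a list of its neighbours (assumed undirected).
    """
    counts = {}
    for source in adj:
        # Breadth-first search gives shortest-path distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for d in dist.values():
            if d > 0:
                counts[d] = counts.get(d, 0) + 1
    # Every unordered pair {u, v} was counted once from u and once from v.
    max_d = max(counts, default=0)
    return [counts.get(k, 0) // 2 for k in range(max_d + 1)]


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))


# A single hexagon (6-cycle) as a hypothetical test graph.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
coeffs = hosoya_coefficients(hexagon)  # [0, 6, 6, 3], i.e. 6x + 6x^2 + 3x^3
```

Evaluating at $x = 1$ returns the number of vertex pairs, here 15, which is a quick sanity check on the coefficient list.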

Thesis: Ministry of Science, Research and Technology - University of Tabriz 1382

Translation as a communicative process is always said to be associated with various aspects of meaning loss or gain. Subtitling as a mode of translating, due to the special discoursal and textual conditions imposed upon it, is believed to be an obvious case of this loss or gain. Presenting the spoken sound track of a film in writing and synchronizing the perception of this text by the viewers with...

Journal: Inf. Sci. 2011
Kao-Shing Hwang, Hsin-Yi Lin, Yuan-Pao Hsu, Hung-Hsiu Yu

This work describes a novel algorithm that integrates an adaptive resonance method (ARM), i.e. an ART-based algorithm with a self-organized design, and a Q-learning algorithm. By dynamically adjusting the size of sensitivity regions of each neuron and adaptively eliminating one of the redundant neurons, ARM can preserve resources, i.e. available neurons, to accommodate additional categories. As...

Journal: Theory of Approximation and Applications 0
Shadan Sadigh Behzadi, Islamic Azad University, Qazvin Branch

In this paper, He's highly prolific variational iteration method is applied effectively to show existence and uniqueness for, and to solve, a class of singular second-order two-point boundary value problems. The process of finding the solution involves generating a sequence of appropriate approximate iterative solution functions equally likely to converge to the exact solution of the given pr...

Journal: Asia Oceania Journal of Nuclear Medicine and Biology 0
Yasuharu Wakabayashi (Division of Radiological Technology, Saitama Prefectural Cancer Center, Saitama, Japan); Kenichi Kashikura (Graduate School of Radiological Technology, Gunma Prefectural College of Health Sciences, Gunma, Japan); Yasuyuki Takahashi (Graduate School of Radiological Technology, Gunma Prefectural College of Health Sciences, Gunma, Japan); Hitoshi Yabe (Division of Health Sciences, Graduate School of Medical Sciences, Kanazawa University, Kanazawa, Japan); Akihiro Ichikawa (Division of Molecular Imaging, Saitama Prefectural Cancer Center, Saitama, Japan); Souichi Yamamoto (Division of Radiological Technology, Saitama Prefectural Cancer Center, Saitama, Japan)

Objective(s): The present study was conducted to examine whether the standardized uptake value (SUV) may be affected by the spatial position of a lesion in the radial direction on positron emission tomography (PET) images obtained via two methods based on time-of-flight (TOF) reconstruction and point spread function (PSF). Methods: A cylinder phantom with a sphere (30 mm diameter), located in...

Journal: International Journal of Management and Business Research 2013
M. H. Kamfiroozi, A. Bonyadinaeini

Enterprise resource planning (ERP) software is needed by industries and companies that want to develop in the future. Many manufacturers and companies have problems with ERP software selection. An inappropriate selection process can significantly affect both the implementation and the performance of the company. Although several models have been proposed to solve this problem, many of them did n...

Journal: J. Economic Theory 2001
In-Koo Cho

We examine the stability of rational expectations equilibria in the class of models in which the decision of the individual agent is discontinuous with respect to the state variables. Instead of rational expectations, each agent learns the unknown parameters through a recursive stochastic algorithm. If the agents update the estimated value function "rapidly" enough, then each agent learns the true v...
