action value function

Compactifications and function spaces on weighted semigroups

Thesis: Ministry of Science, Research and Technology - Ferdowsi University of Mashhad - Faculty of Science 1377

Ali Akbar Khadem Maboudi, Mohammad Ali Pourabdollah Nejad

Chapter one is devoted to a moderate discussion of preliminaries, according to our requirements. Chapter two, which is based on our work in (24), is devoted to introducing weighted semigroups (s, w) and studying some famous function spaces on them; in particular, the relations between go (s, w) and other function spaces are investigated. In fact, this chapter is a complement to (32). One of the main fea...

The Advantage Learning Operator

2016

Greg Farquhar

Value-based reinforcement learning typically involves the repeated application of an update rule, such as the Bellman operator $T_B$, to an action-value function. Recent work has explored the use of alternative operators, which remain optimality-preserving and may result in improved performance. In this report, I study in particular the advantage learning operator, $T_{AL} Q = T_B Q - \alpha (V - Q)$. A theoret...
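The operator in this abstract can be illustrated on a small tabular problem. The sketch below is not taken from the report: it assumes a known transition tensor `P`, reward table `R`, discount `gamma`, and the common convention V(s) = max_a Q(s, a), none of which the abstract spells out. The function names are hypothetical.

```python
import numpy as np

def bellman_operator(Q, P, R, gamma):
    """Apply the Bellman optimality operator to a tabular Q of shape (S, A).

    P has shape (S, A, S) with P[s, a, s'] the transition probability,
    R has shape (S, A). These inputs are assumptions of this sketch.
    """
    V = Q.max(axis=1)            # assumed convention: V(s) = max_a Q(s, a)
    return R + gamma * (P @ V)   # (S, A, S) @ (S,) -> (S, A)

def advantage_learning_operator(Q, P, R, gamma, alpha):
    """Bellman backup minus alpha times the action gap V(s) - Q(s, a)."""
    V = Q.max(axis=1, keepdims=True)
    return bellman_operator(Q, P, R, gamma) - alpha * (V - Q)

# Toy MDP: action a deterministically moves to state a from any state.
P = np.zeros((2, 2, 2))
P[:, 0, 0] = 1.0
P[:, 1, 1] = 1.0
R = np.array([[1.0, 0.0], [0.0, 2.0]])
Q = R.copy()

TB = bellman_operator(Q, P, R, gamma=0.9)
TAL = advantage_learning_operator(Q, P, R, gamma=0.9, alpha=0.5)
```

For the greedy action in each state the gap V − Q is zero, so the two operators agree there; for every other action the advantage learning backup is lower by `alpha` times the gap, which is what widens the action gap.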

Hosoya polynomials of random benzenoid chains

Journal: Iranian Journal of Mathematical Chemistry 2016

S.-J. Xu, Q.-H. He, S. Zhou, W. H. Chan

Let $g$ be a molecular graph with vertex set $v(g)$, and $d_g(u, v)$ the topological distance between vertices $u$ and $v$ in $g$. The Hosoya polynomial $h(g, x)$ of $g$ is the polynomial $\sum\limits_{\{u, v\}\subseteq v(g)} x^{d_g(u, v)}$ in the variable $x$. In this paper, we obtain an explicit analytical expression for the expected value of the Hosoya polynomial of a random benzenoid chain with $n$ hexagon...
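As a concrete illustration of the definition above (not of the paper's closed-form result), the coefficients of a Hosoya polynomial can be computed for a small graph with breadth-first search. This sketch assumes a connected, unweighted graph given as an adjacency dictionary and counts only pairs of distinct vertices; the function name `hosoya_coefficients` is hypothetical.

```python
from collections import Counter, deque

def hosoya_coefficients(adjacency):
    """Return {distance: number of unordered vertex pairs at that distance}.

    `adjacency` maps each vertex to an iterable of its neighbours.
    Assumes the graph is connected and undirected (an assumption of this sketch).
    """
    counts = Counter()
    for source in adjacency:
        # Breadth-first search gives shortest-path distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for target, d in dist.items():
            if target != source:
                counts[d] += 1
    # Every unordered pair {u, v} was seen twice, once from each endpoint.
    return {d: c // 2 for d, c in sorted(counts.items())}

# A single hexagon (the carbon skeleton of benzene) as a 6-cycle.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
coefficients = hosoya_coefficients(hexagon)
```

For the hexagon this yields 6 pairs at distance 1, 6 at distance 2, and 3 at distance 3, that is, the polynomial $6x + 6x^2 + 3x^3$.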

Interpersonal function of language in subtitling

Thesis: Ministry of Science, Research and Technology - University of Tabriz 1382

Hassan Mortazavi Heris, Kazem Lotfipour Saedi, Mohammad Ali Torabi

Translation as a communicative process is always said to be associated with various aspects of meaning loss or gain. Subtitling as a mode of translating, due to the special discoursal and textual conditions imposed upon it, is believed to be an obvious case of this loss or gain. Presenting the spoken sound track of a film in writing and synchronizing the perception of this text by the viewers with...

Self-organizing state aggregation for architecture design of Q-learning

Journal: Inf. Sci. 2011

Kao-Shing Hwang, Hsin-Yi Lin, Yuan-Pao Hsu, Hung-Hsiu Yu

This work describes a novel algorithm that integrates an adaptive resonance method (ARM), i.e. an ART-based algorithm with a self-organized design, and a Q-learning algorithm. By dynamically adjusting the size of sensitivity regions of each neuron and adaptively eliminating one of the redundant neurons, ARM can preserve resources, i.e. available neurons, to accommodate additional categories. As...

Dorsal striatum is necessary for stimulus-value but not action-value learning in humans

Journal: Brain 2014

Application of variational iteration method for solving singular two point boundary value problems

Journal: Theory of Approximation and Its Applications

Shadan Sadigh Behzadi (Islamic Azad University, Qazvin Branch)

In this paper, He's highly prolific variational iteration method is applied effectively for showing the existence and uniqueness of solutions to, and for solving, a class of singular second order two point boundary value problems. The process of finding the solution involves generation of a sequence of appropriate and approximate iterative solution functions equally likely to converge to the exact solution of the given pr...

Evaluation of the effect of tumor position on standardized uptake value using time-of-flight reconstruction and point spread function

Journal: Asia Oceania Journal of Nuclear Medicine and Biology

Yasuharu Wakabayashi (Division of Radiological Technology, Saitama Prefectural Cancer Center, Saitama, Japan); Kenichi Kashikura (Graduate School of Radiological Technology, Gunma Prefectural College of Health Sciences, Gunma, Japan); Yasuyuki Takahashi (Graduate School of Radiological Technology, Gunma Prefectural College of Health Sciences, Gunma, Japan); Hitoshi Yabe (Division of Health Sciences, Graduate School of Medical Sciences, Kanazawa University, Kanazawa, Japan); Akihiro Ichikawa (Division of Molecular Imaging, Saitama Prefectural Cancer Center, Saitama, Japan); Souichi Yamamoto (Division of Radiological Technology, Saitama Prefectural Cancer Center, Saitama, Japan)

Objective(s): The present study was conducted to examine whether the standardized uptake value (SUV) may be affected by the spatial position of a lesion in the radial direction on positron emission tomography (PET) images, obtained via two methods based on time-of-flight (TOF) reconstruction and point spread function (PSF). Methods: A cylinder phantom with a sphere (30 mm diameter), located in...

A hybrid grey-game-MCDM method for ERP selecting based on BSC

Journal: International Journal of Management and Business Research 2013

M. H. Kamfiroozi, A. Bonyadinaeini

Enterprise resource planning (ERP) software is needed by industries and companies that want to develop in the future. Many manufacturers and companies have a problem with ERP software selection. An inappropriate selection process can significantly affect both the implementation and the performance of the company. Although several models have been proposed to solve this problem, many of them did n...

Convergence of Least Squares Learning in Self-Referential Discontinuous Stochastic Models

Journal: J. Economic Theory 2001

In-Koo Cho

We examine the stability of rational expectations equilibria in the class of models in which the decision of the individual agent is discontinuous with respect to the state variables. Instead of rational expectations, each agent learns the unknown parameters through a recursive stochastic algorithm. If the agents update the estimated value function "rapidly" enough, then each agent learns the true v...
