Search results for: bellman

Number of results: 4956

Journal: Journal of Mathematical Analysis and Applications, 1984

Journal: Indiana University Mathematics Journal, 2000

Journal: International Journal of Pure and Applied Mathematics, 2013

Journal: Proceedings of the ISCIE International Symposium on Stochastic Systems Theory and its Applications, 2000

Journal: Journal of Economic Theory, 2021

We show that competitive equilibria in a range of models related to production networks can be recovered as solutions to dynamic programs. Although these programs fail to be contractive, we prove that they are tractable. As an illustration, we treat Coase's theory of the firm, chains with transaction costs, and multiple partners. We then show how the same techniques extend to other equilibrium and decision problems, such as the distributio...

1993
Ronald J. Williams, Leemon C. Baird

Consider a given value function on states of a Markov decision problem, as might result from applying a reinforcement learning algorithm. Unless this value function equals the corresponding optimal value function, at some states there will be a discrepancy, which it is natural to call the Bellman residual, between what the value function specifies at that state and what is obtained by a one-step lo...

2017
Matthieu Geist, Bilal Piot, Olivier Pietquin

This paper aims at theoretically and empirically comparing two standard optimization criteria for Reinforcement Learning: i) maximization of the mean value and ii) minimization of the Bellman residual. For that purpose, we place ourselves in the framework of policy search algorithms, that are usually designed to maximize the mean value, and derive a method that minimizes the residual ‖T∗vπ − vπ...

2017
Zoran Tiganj, Karthik H. Shankar, Marc W. Howard

Natural learners must compute an estimate of future outcomes that follow from a stimulus in continuous time. Critically, the learner cannot in general know a priori the relevant time scale over which meaningful relationships will be observed. Widely used reinforcement learning algorithms discretize continuous time and use the Bellman equation to estimate exponentially-discounted future reward. ...
