Search results for: bellman
Number of results: 4956
We show that competitive equilibria in a range of models related to production networks can be recovered as solutions to dynamic programs. Although these programs fail to be contractive, we prove that they are tractable. As an illustration, we treat Coase's theory of the firm, chains with transaction costs, and multiple partners. We then show how the same techniques extend to other equilibrium and decision problems, such as the distributio...
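As a loose illustration of the abstract's point that a dynamic program can fail to be contractive yet remain tractable (this is not the paper's model), the sketch below iterates an undiscounted Bellman operator for minimum-cost values on a small made-up graph. Without discounting the operator is not a sup-norm contraction, but the iteration still converges here.

```python
import numpy as np

# Undiscounted cost-to-go on a tiny graph, computed by repeatedly applying the
# Bellman operator (Bellman-Ford style sweeps). The graph and costs are made up.
INF = np.inf
costs = {                     # costs[u][v] = cost of moving from node u to node v
    0: {1: 4.0, 2: 1.0},
    1: {3: 1.0},
    2: {1: 2.0, 3: 5.0},
    3: {},                    # terminal node
}

v = {u: (0.0 if u == 3 else INF) for u in costs}   # value = cost-to-go to node 3
for _ in range(len(costs)):                        # enough sweeps to converge
    v = {u: (0.0 if u == 3 else
             min((c + v[w] for w, c in costs[u].items()), default=INF))
         for u in costs}

print(v)   # {0: 4.0, 1: 1.0, 2: 3.0, 3: 0.0}
```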
Consider a given value function on states of a Markov decision problem, as might result from applying a reinforcement learning algorithm. Unless this value function equals the corresponding optimal value function, at some states there will be a discrepancy, which is natural to call the Bellman residual, between what the value function specifies at that state and what is obtained by a one-step lo...
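As a rough illustration of the residual described here (not the paper's algorithm), the sketch below computes the per-state Bellman residual for a small tabular MDP; the transition matrices, rewards, and value estimate are made-up examples.

```python
import numpy as np

def bellman_residual(v, P, R, gamma):
    """Per-state residual |v(s) - max_a (R[a][s] + gamma * P[a][s] @ v)|,
    i.e. the gap between v and its one-step lookahead (Bellman backup)."""
    backups = np.array([R[a] + gamma * P[a] @ v for a in range(len(P))])  # (A, S)
    lookahead = backups.max(axis=0)     # greedy one-step lookahead value
    return np.abs(v - lookahead)        # zero everywhere iff v is optimal

# Tiny illustrative MDP: 2 states, 2 actions.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),    # P[a][s, s'] transition probabilities
     np.array([[0.5, 0.5], [0.6, 0.4]])]
R = [np.array([1.0, 0.0]), np.array([0.5, 0.5])]   # R[a][s] expected rewards
v = np.array([2.0, 1.0])                # some learned value estimate
print(bellman_residual(v, P, R, gamma=0.95))
```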
This paper aims at theoretically and empirically comparing two standard optimization criteria for Reinforcement Learning: i) maximization of the mean value and ii) minimization of the Bellman residual. For that purpose, we place ourselves in the framework of policy search algorithms, which are usually designed to maximize the mean value, and derive a method that minimizes the residual ‖T∗vπ − vπ...
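For concreteness, a hedged tabular sketch of the two criteria, the mean value of a policy under a state distribution nu and a weighted Bellman optimality residual, is given below; the MDP representation and the norm choice are assumptions for illustration, not the paper's exact setup.

```python
import numpy as np

def policy_value(pi, P, R, gamma):
    """Exact v_pi for a tabular MDP: solve (I - gamma * P_pi) v = r_pi.
    pi[s, a] is the policy, P[a][s, s'] transitions, R[a][s] rewards."""
    S = pi.shape[0]
    P_pi = sum(pi[:, a:a + 1] * P[a] for a in range(len(P)))   # (S, S)
    r_pi = sum(pi[:, a] * R[a] for a in range(len(P)))         # (S,)
    return np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)

def mean_value(v_pi, nu):
    """Criterion (i): expected value of the policy under state distribution nu."""
    return nu @ v_pi

def bellman_optimality_residual(v_pi, P, R, gamma, nu):
    """Criterion (ii): nu-weighted L1 norm of T* v_pi - v_pi."""
    Tstar_v = np.max([R[a] + gamma * P[a] @ v_pi for a in range(len(P))], axis=0)
    return nu @ np.abs(Tstar_v - v_pi)
```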
Natural learners must compute an estimate of future outcomes that follow from a stimulus in continuous time. Critically, the learner cannot in general know a priori the relevant time scale over which meaningful relationships will be observed. Widely used reinforcement learning algorithms discretize continuous time and use the Bellman equation to estimate exponentially-discounted future reward. ...
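The abstract's point that discretizing continuous time and applying the Bellman equation commits the learner to a single exponential discount rate can be seen in a minimal TD(0) sketch; the step size, time constant, and update rule below are illustrative assumptions, not the paper's model.

```python
import numpy as np

dt = 0.1                     # width of each time bin after discretizing time
tau = 2.0                    # assumed time scale of discounting
gamma = np.exp(-dt / tau)    # exponential discount per discrete step

def td0_update(V, s, r, s_next, alpha=0.1):
    """One TD(0) step toward the Bellman target r + gamma * V[s_next]."""
    target = r + gamma * V[s_next]
    V[s] += alpha * (target - V[s])
    return V

# A reward arriving a time t in the future is weighted by gamma ** (t / dt)
# = exp(-t / tau), so the choice of gamma fixes one time scale in advance.
V = np.zeros(5)
V = td0_update(V, s=0, r=1.0, s_next=1)
```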