Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization
Authors
Abstract
Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the journal title. However, use of a template does not certify that the paper has been accepted for publication in the named journal. INFORMS journal templates are for the exclusive purpose of submitting to an INFORMS journal and should not be used to distribute the papers in print or online or to submit the papers to another publication.
Related resources
Efficient Exploration and Value Function Generalization in Deterministic Systems
We consider the problem of reinforcement learning over episodes of a finite-horizon deterministic system and as a solution propose optimistic constraint propagation (OCP), an algorithm designed to synthesize efficient exploration and value function generalization. We establish that when the true value function Q* lies within the hypothesis class Q, OCP selects optimal actions over all but at mos...
Batch Reinforcement Learning for Spoken Dialogue Systems with Sparse Value Function Approximation
In this paper, we propose to combine sample-efficient generalization frameworks for RL with a feature selection algorithm for the learning of an optimal spoken dialogue system (SDS) strategy.
Deciding to Specialize and Respecialize a Value Function for Relational Reinforcement Learning
We investigate the matter of feature selection in the context of relational reinforcement learning. We had previously hypothesized that it is more efficient to specialize a value function quickly, making specializations that are potentially suboptimal as a result, and to later modify that value function in the event that the agent gets it “wrong.” Here we introduce agents with the ability to ad...
On Determinism Handling While Learning Reduced State Space Representations
When applying a Reinforcement Learning technique to problems with continuous or very large state spaces, some kind of generalization is required. In the bibliography, two main approaches can be found. On one hand, the generalization problem can be defined as an approximation problem of the continuous value function, typically solved with neural networks. On the other hand, other approaches disc...
Generalization and Exploration via Randomized Value Functions
We propose randomized least-squares value iteration (RLSVI), a new reinforcement learning algorithm designed to explore and generalize efficiently via linearly parameterized value functions. We explain why versions of least-squares value iteration that use Boltzmann or ε-greedy exploration can be highly inefficient, and we present computational results that demonstrate dramatic efficiency gains...
Journal title:
- Math. Oper. Res.
Volume: 42
Publication date: 2017