نتایج جستجو برای: reward processes
تعداد نتایج: 554393 فیلتر نتایج به سال:
Multivariate reward processes with reward functions of constant rates, defined on a semi-Markov process, first were studied by Masuda and Sumita, 1991. Reward processes with nonlinear reward functions were introduced in Soltani, 1996. In this work we study a multivariate process , , where are reward processes with nonlinear reward functions respectively. The Laplace transform of the covar...
We introduce a class of learning problems where the agent is presented with a series of tasks. Intuitively, if there is a relation among those tasks, then the information gained during execution of one task has value for the execution of another task. Consequently, the agent is intrinsically motivated to explore its environment beyond the degree necessary to solve the current task it has at han...
We consider a learning problem where the decision maker interacts with a standard Markov decision process, with the exception that the reward functions vary arbitrarily over time. We show that, against every possible realization of the reward process, the agent can perform as well—in hindsight—as every stationary policy. This generalizes the classical no-regret result for repeated games. Specif...
In robust Markov decision processes (MDPs), the uncertainty in transition kernel is addressed by finding a policy that optimizes worst-case performance over an set of MDPs. While much literature has focused on discounted MDPs, average-reward MDPs remain largely unexplored. this paper, we focus where goal to find average reward set. We first take approach approximates using prove value function ...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید