Search results for: q policy

Number of results: 381585

Journal: Rel. Eng. & Sys. Safety 2005
Wei Xie Yiguang Hong Kishor S. Trivedi

A two-level rejuvenation policy for software systems with degradation process is studied. Both full restarts and partial restarts are considered in this rejuvenation strategy. A semi-Markov process model is constructed, and based on its closed-form solution we obtain the system availability as a bivariate function. Then, the rejuvenation policy is analyzed to maximize the system availability. S...

2002
Ali Raza Butt Sumalatha Adabala Nirav H. Kapadia Renato J. O. Figueiredo José A. B. Fortes

Computational grids provide computing power by sharing resources across administrative domains. This sharing, coupled with the need to execute untrusted code from arbitrary users, introduces security hazards. This paper addresses the security implications of making a computing resource available to untrusted applications via computational grids. It highlights the problems and limitations of curren...

2000
Manu Sridharan Gerald Tesauro

We study the use of single-agent and multi-agent Q-learning to learn seller pricing strategies in three different two-seller models of agent economies, using a simple regression tree approximation scheme to represent the Q-functions. Our results are highly encouraging: regression trees match the training times and policy performance of lookup table Q-learning, while offering significant advantage...

2007
Francisco S. Melo M. Isabel Ribeiro

In this paper, we analyze the convergence of Q-learning with linear function approximation. We identify a set of conditions that implies the convergence of this method with probability 1, when a fixed learning policy is used. We discuss the differences and similarities between our results and those obtained in several related works. We also discuss the applicability of this method when a changi...

Journal: Oper. Res. Lett. 2007
Christian Larsen Gudrun P. Kiesmüller

Developing a closed-form cost expression for an (R,s,nQ) policy where the demand process is compound generalized Erlang. Abstract: We derive a closed-form cost expression for an (R,s,nQ) inventory control policy where all replenishment orders have ...

2007
Blaise Thomson Jost Schatzmann Karl Weilhammer Hui Ye Steve Young

Partially Observable Markov Decision Processes provide a principled way to model uncertainty in dialogues. However, traditional algorithms for optimising policies are intractable except for cases with very few states. This paper discusses a new approach to policy optimisation based on grid-based Q-learning with a summary of belief space. We also present a technique for bootstrapping the system ...

2007
Philip Melchiors Rommert Dekker Marcel Kleijn

Whenever demand for a single item can be categorized into classes of different priority, an inventory rationing policy should be considered. In this paper we analyse a continuous review (s, Q) model with lost sales and two demand classes. A so-called critical level policy is applied to ration the inventory among the two demand classes. With this policy, low-priority demand is rejected in anticipation of...

1997
Dimitri P. Bertsekas

We consider the approximate solution of stochastic optimal control problems using a neurodynamic programming/reinforcement learning methodology. We focus on the computation of a rollout policy, which is obtained by a single policy iteration starting from some known base policy and using some form of exact or approximate policy improvement. We indicate that, in a stochastic environment, the popu...

2007
Yuan-Pao Hsu Kao-Shing Hwang Hsin-Yi Lin

This article presents an algorithm that combines a FAST-based algorithm (Flexible Adaptable-Size Topology), called ARM, with the Q-learning algorithm. The ARM is a self-organizing architecture. By dynamically adjusting the size of the sensitivity region of each neuron and adaptively pruning redundant neurons, the ARM can preserve resources (available neurons) to accommodate more categories. The...
