Search results for: q policy

Number of results: 381585

Journal: Wireless Networks 2014
Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar

In this paper, we consider an intrusion detection application for Wireless Sensor Networks (WSNs). We study the problem of scheduling the sleep times of the individual sensors, where the objective is to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spac...

1999
Qiming He, Mark A. Shayman

This paper presents a fast Reinforcement Learning (RL) algorithm for solving Partially Observable Markov Decision Process (POMDP) problems. The proposed algorithm is devised to provide a policy-making framework for Network Management Systems (NMS), which are in essence an engineering application without an exact model. The algorithm consists of two phases. Firstly, the model is estimated and policy...

Journal: CoRR 2017
Anna Harutyunyan, Peter Vrancx, Pierre-Luc Bacon, Doina Precup, Ann Nowé

A temporally abstract action, or an option, is specified by a policy and a termination condition: the policy guides option behavior, and the termination condition roughly determines its length. Generally, learning with longer options (like learning with multi-step returns) is known to be more efficient. However, if the option set for the task is not ideal, and cannot express the primitive optim...

Journal: Computers & Industrial Engineering 2012
Liwei Bai, Christos Alexopoulos, Mark E. Ferguson, Kwok-Leung Tsui

Generally, the derivation of an inventory policy requires the knowledge of the underlying demand distribution. Unfortunately, in many settings such as retail, demand is not completely observable in a direct way or inventory records may be inaccurate. A variety of factors, including the potential inaccuracy of inventory records, motivate retailers to seek replenishment policies with a fixed orde...

Journal: International Journal of Computational Intelligence and Applications 2006
Dean C. Wardell, Gilbert L. Peterson

Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and ability to continually learn even as the operating environment changes. Additionally, applying reinforcement learning to multiple cooperative software agents (a multi-agent system) not only allows each individual ag...

Journal: Journal of Monetary Economics 2022

We quantify spillbacks from US monetary policy. We use structural scenario analysis and minimum relative entropy methods. Spillbacks reflect a non-trivial share of the domestic effect. They materialise through Tobin’s q/cash flow and stock market wealth effects. Spillovers from US monetary policy entail spillbacks to the domestic economy. Applying counterfactual analyses in a Bayesian proxy structural vector-autoregressive model, we find that spillbacks account for...

Journal: School Effectiveness and School Improvement 2023

In order to support research on school effectiveness, there is a need for valid and reliable instruments to assess policymaking capacities of schools. Increasingly, policymaking is seen as a shared responsibility of the entire pedagogical team of the school. In this article, data were analysed from a sample of 1,696 (care) teachers, coordinators, and principals of 77 Flemish primary schools to assess critical aspects concerning the validity and reliability of the Po...

Journal: Journal of Combinatorial Theory, Series A 2016

Journal: Automatica 2022

A novel reinforcement learning algorithm is introduced for multiarmed restless bandits with average reward, using the paradigms of Q-learning and the Whittle index. Specifically, we leverage the structure of the Whittle index policy to reduce the search space of Q-learning, resulting in major computational gains. Rigorous convergence analysis is provided, supported by numerical experiments. The numerical experiments show excellent empir...

Journal: Computer Communications 2006
Tricha Anjali, Caterina M. Scoglio

In this paper, a new optimal policy is introduced to determine, adapt, and protect the Generalized MultiProtocol Label Switching (GMPLS) network topology based on the current traffic load. The Integrated Traffic Engineering (ITE) paradigm provides mechanisms for dynamic addition of physical capacity to optical networks. In the absence of such mechanisms, the rejection of incoming requests may b...
