Search results for: q policy

Number of results: 381585

Journal: Mathematics 2023

Deglobalization, as opposed to globalization, appears in the world order due to local solutions to problems and border controls, ignoring the principles of treaties, trade wars, and the expansion of regionalism. In addition, slowbalization helps shrink the global flow of trade, information, and societal and cultural exchange dynamism. However, this scary order, triggered by deglobalization and slowbalization, significantly impac...

2014
Kallirroi Georgila Claire Nelson David R. Traum

We use single-agent and multi-agent Reinforcement Learning (RL) for learning dialogue policies in a resource allocation negotiation scenario. Two agents learn concurrently by interacting with each other without any need for simulated users (SUs) to train against or corpora to learn from. In particular, we compare the Q-learning, Policy Hill-Climbing (PHC) and Win or Learn Fast Policy Hill-Climbi...

Journal: Health policy and planning 2009
Orville Solon Kimberly Woo Stella A Quimbo Riti Shimkhada Jhiedon Florentino John W Peabody

OBJECTIVES: Measuring and monitoring health system performance is important, albeit controversial. Technical, logistic and financial challenges are formidable. We introduced a system of measurement, which we call Q, to measure the quality of hospital clinical performance across a range of facilities. This paper describes how Q was developed, implemented in hospitals in the Philippines and how it ...

Journal: IEEE Transactions on Neural Networks and Learning Systems 2020

Journal: CoRR 2017
Tuomas Haarnoja Aurick Zhou Pieter Abbeel Sergey Levine

Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and brittle convergence properties, which necessitate meticulous hyperparameter tuning. Both of these challenges severely limit the applicability of such methods t...

2006
Keqi Yan Vidyadhar G. Kulkarni Amarjit Budhiraja Tugrul Sanli Jayashankar M. Swaminathan

Keqi Yan: Fluid Models for Production-Inventory Systems (under the direction of Professor Vidyadhar G. Kulkarni). We consider a single stage production-inventory system whose production and demand rates are modulated by a finite state Markov chain called the environment. Supplementary orders can be placed from external suppliers when needed. We model this system by a fluid-flow system and derive...

2016
F. Berthaut A. Gharbi R. Pellerin

The control of a stochastic manufacturing system that executes capital asset repairs and remanufacturing in an integrated system is examined. The remanufacturing resources respond to planned returns of worn-out equipment at the end of its expected life and unplanned returns triggered by major equipment failures. Remanufacturing operations for planned demand can be executed at different rates...

2006
Mohan Babu Shalabh Bhatnagar

We propose two variants of the Q-learning algorithm that (both) use two timescales. One of these updates Q-values of all feasible state-action pairs at each instant while the other updates Q-values of states with actions chosen according to the ‘current’ randomized policy updates. A sketch of convergence of the algorithms is shown. Finally, numerical experiments using the proposed algorithms fo...

2003
Joshua Cole John Lloyd Kee Siong Ng

This paper investigates an approach to designing and building adaptive agents. The main contribution is the use of a symbolic machine learning system for approximating the policy and Q functions that are at the heart of the agent. Under the assumption that sufficient knowledge of the application domain is available, it is shown how this knowledge can be provided to the agent in the form of symb...
