q policy

Search results for: q policy

Number of results: 381585

Globalization–Income Inequality Nexus in the Post-Soviet Countries: Analysis of Heterogeneous Dataset Using the Quantiles via Moments Approach

Journal: Mathematics, 2023

Deglobalization, as opposed to the term globalization, appears in the world order due to local solutions to problems and border controls, ignoring the principles of treaties, trade wars, and the expansion of regionalism. In addition, slowbalization helps shrink the global flow of trade, information, and societal and cultural exchange dynamism. However, this scary order, triggered by deglobalization and slowbalization, significantly impac...

Discrete-Time Multi-Player Games Based on Off-Policy Q-Learning

Journal: IEEE Access, 2019

Single-Agent vs. Multi-Agent Techniques for Concurrent Reinforcement Learning of Negotiation Dialogue Policies

2014

Kallirroi Georgila, Claire Nelson, David R. Traum

We use single-agent and multi-agent Reinforcement Learning (RL) for learning dialogue policies in a resource allocation negotiation scenario. Two agents learn concurrently by interacting with each other without any need for simulated users (SUs) to train against or corpora to learn from. In particular, we compare the Q-learning, Policy Hill-Climbing (PHC) and Win or Learn Fast Policy Hill-Climbi...

A novel method for measuring health care system performance: experience from QIDS in the Philippines.

Journal: Health Policy and Planning, 2009

Orville Solon, Kimberly Woo, Stella A Quimbo, Riti Shimkhada, Jhiedon Florentino, John W Peabody

OBJECTIVES: Measuring and monitoring health system performance is important, albeit controversial. Technical, logistic and financial challenges are formidable. We introduced a system of measurement, which we call Q, to measure the quality of hospital clinical performance across a range of facilities. This paper describes how Q was developed, implemented in hospitals in the Philippines and how it ...

Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network

Journal: IEEE Transactions on Neural Networks and Learning Systems, 2020

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Journal: CoRR, 2017

Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine

Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and brittle convergence properties, which necessitate meticulous hyperparameter tuning. Both of these challenges severely limit the applicability of such methods t...

Fluid Models for Production-Inventory Systems

2006

Keqi Yan, Vidyadhar G. Kulkarni, Amarjit Budhiraja, Tugrul Sanli, Jayashankar M. Swaminathan

Keqi Yan: Fluid Models for Production-Inventory Systems (under the direction of Professor Vidyadhar G. Kulkarni). We consider a single-stage production-inventory system whose production and demand rates are modulated by a finite-state Markov chain called the environment. Supplementary orders can be placed from external suppliers when needed. We model this system by a fluid-flow system and derive...

Joint Hybrid Repair and Remanufacturing Systems and Supply Control

2016

F. Berthaut, A. Gharbi, R. Pellerin

The control of a stochastic manufacturing system that executes capital asset repairs and remanufacturing in an integrated system is examined. The remanufacturing resources respond to planned returns of worn-out equipment at the end of their expected life and unplanned returns triggered by major equipment failures. Remanufacturing operations for planned demand can be executed at different rates...

Two-Timescale Q-Learning with an Application to Routing in Communication Networks

2006

Mohan Babu, Shalabh Bhatnagar

We propose two variants of the Q-learning algorithm that (both) use two timescales. One of these updates Q-values of all feasible state-action pairs at each instant while the other updates Q-values of states with actions chosen according to the ‘current’ randomized policy updates. A sketch of convergence of the algorithms is shown. Finally, numerical experiments using the proposed algorithms fo...

Symbolic Learning for Adaptive Agents

2003

Joshua Cole, John Lloyd, Kee Siong Ng

This paper investigates an approach to designing and building adaptive agents. The main contribution is the use of a symbolic machine learning system for approximating the policy and Q functions that are at the heart of the agent. Under the assumption that sufficient knowledge of the application domain is available, it is shown how this knowledge can be provided to the agent in the form of symb...
