reward packages

Search results for: reward packages

Number of results: 46029

An Alternative Softmax Operator for Reinforcement Learning

2017

Kavosh Asadi Michael L. Littman

A softmax operator applied to a set of values acts somewhat like the maximization function and somewhat like an average. In sequential decision making, softmax is often used in settings where it is necessary to maximize utility but also to hedge against problems that arise from putting all of one’s weight behind a single maximum utility decision. The Boltzmann softmax operator is the most commo...
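The "between max and average" behavior described in this abstract can be illustrated with a minimal sketch of the standard Boltzmann softmax operator. This is not the alternative operator the paper proposes; the function name `boltzmann_softmax` and the parameter name `beta` (inverse temperature) are assumptions chosen for illustration.

```python
import math

def boltzmann_softmax(values, beta):
    """Boltzmann-weighted average of `values` with inverse temperature `beta`.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Shift by the maximum before exponentiating to avoid overflow.
    m = max(values)
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    # Weighted average of the values, with weights proportional to exp(beta * v).
    return sum(w * v for w, v in zip(weights, values)) / total

values = [1.0, 2.0, 3.0]
print(boltzmann_softmax(values, 0.0))   # beta = 0: every weight is equal, so this is the mean
print(boltzmann_softmax(values, 50.0))  # large beta: the weight concentrates on the maximum
```

With `beta = 0` the operator reduces to the plain average, and as `beta` grows it approaches the maximum, which is the hedging trade-off the abstract refers to.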


Gradient Algorithms for Exploration/Exploitation Trade-Offs: Global and Local Variants

2012

Michel Tokic Günther Palm

Gradient-following algorithms are deployed for efficient adaptation of exploration parameters in temporal-difference learning with discrete action spaces. Global and local variants are evaluated in discrete and continuous state spaces. The global variant is memory efficient in terms of requiring exploratory data only for starting states. In contrast, the local variant requires exploratory data ...


Comparisons of Hadronic Shower Packages

2004

Georgios Mavromanolakis

The high precision measurements needed to exploit the physics potential of an e+e− Future Linear Collider with 0.5 to 1 TeV center-of-mass energy range set strict requirements on performance of vertex, tracking and calorimetric detectors. The CALICE Collaboration [1] has been formed to conduct the research and development effort needed to bring initial conceptual designs for the calorimetry to a fin...


Investigation of Effort-Reward Imbalance Model as predictor of Counterproductive Work Behaviors

Journal: Occupational Medicine (طب کار) 2021

Mohammad Babamiri Bahareh Heydari Tahmineh Moradi Tamadon Alireza Mortezapour Soufiani

Introduction: Nowadays, counterproductive behaviors have become a common and costly problem for many organizations, and managers are always looking for a suitable and practical solution to reduce this type of behavior in their organization. Due to the importance of the subject, the present study aims to investigate the imbalance of effort and reward as a predictor of counterpr...


Orbitofrontal neurons signal reward predictions, not reward prediction errors

Journal: Neurobiology of Learning and Memory 2018


Total Reward Stochastic Games and Sensitive Average Reward Strategies

Journal: Journal of Optimization Theory and Applications 1998


Public Viewing Packages Channel Lineups

2015

A&E 265, ABC Family 311, America’s Auction Network 324, Animal Planet 282, Aqui2 401, AUDIENCE ...


Towards Effective Tourism Dynamic Packages

Journal: IRMJ 2012

Luís Ferreira Goran D. Putnik Maria Manuela Cunha Zlata Putnik

This paper describes the Open Tourism Initiative (OTI) as a framework to support tourism activities, following the Tourism Virtual Enterprise (TVE) organizational model and pragmatics-based collaboration decisions. To ensure better alignment between tourism service providers and clients’ expectations, the framework (and its architecture) must support reliable interoperability and dynamic net...


Effects of modified atmosphere packaging with a silicon gum film as a window for gas exchange on Agrocybe chaxingu storage

2007

Tiehua Li Min Zhang Shaojin Wang

The edible mushroom Agrocybe chaxingu was stored in packages with or without silicon gum film windows in three different modified atmosphere systems (5% O2, with 5%, 10% and 15% CO2) at a temperature of 3 ± 1 °C. The results showed that there were significant differences between the packages with and without the silicon gum film windows on O2, CO2, and ethylene concentrations, respiration rate, ...


Scaling Reinforcement Learning toward RoboCup Soccer

2001

Peter Stone Richard S. Sutton

RoboCup simulated soccer presents many challenges to reinforcement learning methods, including a large state space, hidden and uncertain state, multiple agents, and long and variable delays in the effects of actions. We describe our application of episodic SMDP Sarsa(λ) with linear tile-coding function approximation and variable λ to learning higher-level decisions in a keepaway subtask of RoboCup...

