reward beta model

MRMC Wireless Mesh Networks

2011

Mohammad A Hoque Xiaoyan Hong

Wireless mesh networks are considered one of the vital elements in today’s converged networks, providing high bandwidth and connectivity over large geographical areas. Mesh routers equipped with multiple radios can significantly overcome the capacity problem and increase the aggregate throughput of the network where single-radio nodes suffer from performance degradation. Moreover, the market...

A randomized controlled trial of a behavioral economic intervention for alcohol and marijuana use.

Journal: Experimental and Clinical Psychopharmacology 2015

Ali M Yurasek Ashley A Dennhardt James G Murphy

A recent study demonstrated that a single 50-min supplemental session that targeted the behavioral economic mechanisms of substance-free reinforcement and delayed reward discounting (Substance-Free Activity Session: SFAS) enhanced the efficacy of a standard alcohol brief motivational intervention (BMI) for college drinkers. The purpose of the current study was to conduct a randomized controlled...

An Alternative Softmax Operator for Reinforcement Learning

2017

Kavosh Asadi Michael L. Littman

A softmax operator applied to a set of values acts somewhat like the maximization function and somewhat like an average. In sequential decision making, softmax is often used in settings where it is necessary to maximize utility but also to hedge against problems that arise from putting all of one’s weight behind a single maximum utility decision. The Boltzmann softmax operator is the most commo...
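The "somewhat like the maximization function and somewhat like an average" behavior can be illustrated with a minimal sketch of the Boltzmann softmax operator named in the abstract. This shows only the standard Boltzmann form, not the alternative operator the paper proposes; the function name `boltzmann_softmax`, the parameter name `beta` and the sample values are assumptions chosen for illustration.

```python
import math

def boltzmann_softmax(values, beta):
    """Boltzmann-weighted average of `values` (illustrative helper).

    Each value v gets weight exp(beta * v). With beta = 0 the weights are
    equal, so the result is the plain mean; as beta grows, the weight
    concentrates on the largest value, so the result approaches the max.
    """
    scaled = [beta * v for v in values]
    shift = max(scaled)  # subtract the largest exponent for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total

values = [1.0, 2.0, 4.0]          # hypothetical action values
mean_like = boltzmann_softmax(values, 0.0)
mid_point = boltzmann_softmax(values, 1.0)
max_like = boltzmann_softmax(values, 50.0)
```

With these inputs, `mean_like` equals the arithmetic mean of the values, `max_like` is numerically indistinguishable from the largest value, and `mid_point` falls between the two.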

Gradient Algorithms for Exploration/Exploitation Trade-Offs: Global and Local Variants

2012

Michel Tokic Günther Palm

Gradient-following algorithms are deployed for efficient adaptation of exploration parameters in temporal-difference learning with discrete action spaces. Global and local variants are evaluated in discrete and continuous state spaces. The global variant is memory efficient in terms of requiring exploratory data only for starting states. In contrast, the local variant requires exploratory data ...

Investigation of Effort-Reward Imbalance Model as predictor of Counterproductive Work Behaviors

Journal: Occupational Medicine 2021

Mohammad Babamiri, Bahareh Heydari, Tahmineh Moradi Tamadon, Alireza Mortezapour Soufiani

Introduction: Counterproductive behaviors have become a common and costly problem for many organizations, and managers are always looking for a suitable and practical way to reduce them. Given the importance of the subject, the present study aims to investigate effort-reward imbalance as a predictor of counterpr...

Orbitofrontal neurons signal reward predictions, not reward prediction errors

Journal: Neurobiology of Learning and Memory 2018

Total Reward Stochastic Games and Sensitive Average Reward Strategies

Journal: Journal of Optimization Theory and Applications 1998

Fibroblast migration and collagen deposition during dermal wound healing: mathematical modelling and medical implications

Thesis: Ministry of Science, Research and Technology - University of Tabriz 1388 SH

Kobra Abdoli, Karim Ivaz, Karim Shahrara

The extent to which collagen alignment occurs during dermal wound healing determines the degree of scar tissue. We have modelled this using a multi-part approach. In this model, extracellular materials such as collagen and fibrin are modelled as continua, while cells are treated as discrete units. With this model, we have examined the effects of different parameters on the alignment process and also, from the model, ...

Scaling Reinforcement Learning toward RoboCup Soccer

2001

Peter Stone Richard S. Sutton

RoboCup simulated soccer presents many challenges to reinforcement learning methods, including a large state space, hidden and uncertain state, multiple agents, and long and variable delays in the effects of actions. We describe our application of episodic SMDP Sarsa(λ) with linear tile-coding function approximation and variable λ to learning higher-level decisions in a keepaway subtask of RoboCup...

Multi-Radio Multi-Channel (MRMC) Resource Optimization Method for Wireless Mesh Network

Journal: J. Inf. Sci. Eng. 2016

Degan Zhang Yanan Zhu Si Liu Xiaodan Zhang Jinjie Song

Degan Zhang, Yanan Zhu, Si Liu, Xiaodan Zhang and Jinjie Song. Key Laboratory of Computer Vision and System, Ministry of Education, Tianjin; Tianjin Key Lab of Intelligent Computing and Novel Software Technology, Tianjin University of Technology, Tianjin, 300384, P.R. China; School of Electrical and Information Engineering, The University of Sydney, Sydney, NSW 2006, Australia; Institute of Scientific ...
