q learning

Search results for: q learning

Number of results: 717428

Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning

Journal: Frontiers in Neurorobotics, 2019

Proposing a Bidding Strategy for Energy Sellers in the Electricity Market Using Reinforcement Learning Algorithms Based on Simulated Annealing

Thesis: Ministry of Science, Research and Technology - Ferdowsi University of Mashhad - Faculty of Engineering, 1393 SH

Ghazaleh Mohseni Rad, Mohammad Bagher Naghibi Sistani

This thesis studies the bidding strategy of energy producers in a uniform-price electricity market, with the aim of maximizing profit. The energy sellers submit their price bids for a specific hour of the day to the system operator using three different algorithms: Q-learning, R-learning, and SARSA. These reinforcement learning algorithms are combined with the simulated annealing action-selection method. The profit obtained for the seller...
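A minimal sketch of how Q-learning can be paired with an annealing-based action selection, as the abstract above describes in general terms. It assumes a Metropolis-style acceptance rule and a toy single-state bidding problem; the function names, bid levels, profits, and parameter values are hypothetical and are not taken from the thesis.

```python
import math
import random


def annealed_action(q_row, temperature, rng):
    """Metropolis-style selection: try a random action, else fall back to greedy."""
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    candidate = rng.randrange(len(q_row))
    # The candidate is never better than the greedy action, so this lies in [0, 1].
    accept_prob = math.exp((q_row[candidate] - q_row[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy


def q_learning_update(q, state, action, reward, next_state, alpha, gamma):
    """One-step Q-learning update toward reward + gamma * max_a Q(next_state, a)."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def train(steps=3000, seed=0):
    rng = random.Random(seed)
    profits = [1.0, 2.0, 0.5]   # hypothetical profit for each of three bid levels
    q = [[0.0, 0.0, 0.0]]       # a single state: one fixed hour of the day
    temperature = 10.0
    for _ in range(steps):
        action = annealed_action(q[0], temperature, rng)
        # gamma = 0.0 because each bid in this toy problem is a one-shot decision
        q_learning_update(q, 0, action, profits[action], 0, alpha=0.1, gamma=0.0)
        temperature = max(0.01, temperature * 0.995)  # cooling schedule
    return q


if __name__ == "__main__":
    print(train()[0])
```

At high temperature almost every random candidate is accepted, so the agent explores; as the temperature cools, worse candidates are rejected more often and the choice approaches the greedy bid.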

Induced states in a decision tree constructed by Q-learning

Journal: Inf. Sci., 2012

Kao-Shing Hwang, Yu-Jen Chen, Wei-Cheng Jiang, Tsung-Wen Yang

Article history: Received 28 March 2009; received in revised form 16 May 2012; accepted 3 June 2012; available online 9 June 2012.

On-Line Connectionist Q-Learning Produces Unreliable Performance with A Synonym Finding Task

2000

Ian Johnson, Mark D. Plumbley

A Survey of Reinforcement Learning Methods in the Windy and Cliff-walking Gridworlds

2005

Ryan J. Meuth

This report details the implementation of three reinforcement learning methods (Monte Carlo, SARSA, and Q-learning) and compares their performance in the Windy and Cliff-Walking Gridworlds.
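For reference, the standard textbook one-step updates for two of the methods compared in this report are shown here; the snippet itself does not give them, so the symbols (step size $\alpha$, discount factor $\gamma$) are the conventional ones rather than the report's own notation. The two rules differ only in the bootstrap target: SARSA uses the action actually taken next (on-policy), while Q-learning uses the maximizing action (off-policy).

```latex
% SARSA (on-policy)
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]

% Q-learning (off-policy)
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]
```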

Pruning an ensemble of classifiers via reinforcement learning

Journal: Neurocomputing, 2009

Ioannis Partalas, Grigorios Tsoumakas, Ioannis P. Vlahavas

This paper studies the problem of pruning an ensemble of classifiers from a Reinforcement Learning perspective. It contributes a new pruning approach that uses the Q-learning algorithm in order to approximate an optimal policy of choosing whether to include or exclude each classifier from the ensemble. Extensive experimental comparisons of the proposed approach against state-of-the-art pruning ...

Memory-guided Exploration in Reinforcement Learning

2001

James L. Carroll, Todd S. Peterson, Nancy E. Owens

The life-long learning architecture attempts to create an adaptive agent through the incorporation of prior knowledge over the lifetime of a learning agent. Our paper focuses on task transfer in reinforcement learning and specifically in Q-learning. There are three main model-free methods for performing task transfer in Q-learning: direct transfer, soft transfer, and memory-guided exploration. In ...

ε-MDPs: Learning in Varying Environments

Journal: Journal of Machine Learning Research, 2002

István Szita, Bálint Takács, András Lörincz

In this paper ε-MDP-models are introduced and convergence theorems are proven using the generalized MDP framework of Szepesvári and Littman. Using this model family, we show that Q-learning is capable of finding near-optimal policies in varying environments. The potential of this new family of MDP models is illustrated via a reinforcement learning algorithm called event-learning which separates...

Classification with Costly Features using Deep Reinforcement Learning

Journal: CoRR, 2017

Jaromír Janisch, Tomás Pevný, Viliam Lisý

We study a classification problem where each feature can be acquired for a cost and the goal is to optimize the trade-off between classification precision and the total feature cost. We frame the problem as a sequential decision-making problem, where we classify one sample in each episode. At each step, an agent can use values of acquired features to decide whether to purchase another one or wh...

A Reinforcement Learning System for Transfer Scheduling of Freight Cars in a Train

2010

Yoichi Hirashima

In this paper, a Q-Learning method for transfer scheduling of freight cars in a train is proposed. In the proposed method, the number of freight-movements in order to line freights in the desired order is reflected by evaluation value for each pair of freight-layout and removal-destination at a freight yard. The best transfer scheduling can be derived by selecting the removal-action of freight t...
