Search results for: q algorithm

Number of results: 863118

2017
Markus Dumke

Temporal-difference (TD) learning is an important field in reinforcement learning. Sarsa and Q-Learning are among the most used TD algorithms. The Q(σ) algorithm (Sutton and Barto (2017)) unifies both. This paper extends the Q(σ) algorithm to an online multi-step algorithm Q(σ, λ) using eligibility traces and introduces Double Q(σ) as the extension of Q(σ) to double learning. Experiments sugges...
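As a rough illustration of how a single parameter σ can interpolate between Sarsa-style and Q-learning-style updates, the following sketch computes the standard one-step Q(σ) target. It is not taken from the paper: the function name `q_sigma_target` and all numeric values are hypothetical, and the multi-step Q(σ, λ) and Double Q(σ) extensions mentioned in the abstract are not shown.

```python
def q_sigma_target(reward, gamma, sigma, next_q, next_action, policy_probs):
    """One-step Q(sigma) target (illustrative sketch, names are assumptions).

    sigma = 1 uses the sampled next action (Sarsa);
    sigma = 0 uses the expectation under the target policy
    (Expected Sarsa, which equals Q-learning when that policy is greedy).
    """
    sampled = next_q[next_action]
    expected = sum(p * q for p, q in zip(policy_probs, next_q))
    return reward + gamma * (sigma * sampled + (1.0 - sigma) * expected)


# Hypothetical values: two actions in the next state, greedy target policy.
next_q = [1.0, 3.0]
greedy_policy = [0.0, 1.0]

sarsa_target = q_sigma_target(0.5, 0.9, 1.0, next_q, 0, greedy_policy)
q_learning_target = q_sigma_target(0.5, 0.9, 0.0, next_q, 0, greedy_policy)
mixed_target = q_sigma_target(0.5, 0.9, 0.5, next_q, 0, greedy_policy)
```

Intermediate values of σ blend the sampled and expected backups in the same target.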

Journal: Cybernetics and Information Technologies 2015

Journal: CoRR 2006
Henrik Bäärnhielm

Under the assumption of a certain conjecture, for which there exists strong experimental evidence, we produce an efficient algorithm for constructive membership testing in the Suzuki groups Sz(q), where q = 2^{2m+1} for some m > 0, in their natural representations of degree 4. It is a Las Vegas algorithm with running time O(log(q)) field operations, and a preprocessing step with running time O(log(q) ...

Journal: Computer Systems Science and Engineering 2021

Journal: Control 2022

In this paper, a model-free reinforcement learning-based controller is designed to extract a treatment protocol because the design of a model-based controller is complex due to the highly nonlinear dynamics of cancer. The Q-learning algorithm is used to develop an optimal controller for cancer chemotherapy drug dosing. In the Q-learning algorithm, each entry of the Q-table is updated using data...
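For readers unfamiliar with the Q-table mentioned above, here is a minimal sketch of the standard tabular Q-learning update. It is not the paper's controller: the state and action labels, the reward, and the values of `alpha` and `gamma` are all hypothetical placeholders, and the abstract's own description of the update is truncated.

```python
from collections import defaultdict


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.95):
    """Apply one standard tabular Q-learning update and return the new entry."""
    # Bootstrap from the best action value in the next state.
    best_next = max(q_table[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q_table[(state, action)]
    q_table[(state, action)] += alpha * td_error
    return q_table[(state, action)]


# Hypothetical labels; a real dosing problem would define its own
# state discretisation, action set, and reward signal.
q_table = defaultdict(float)          # unseen entries start at 0.0
actions = ["low_dose", "high_dose"]
new_value = q_update(q_table, "s0", "low_dose", 1.0, "s1", actions,
                     alpha=0.5, gamma=0.9)
```

Because the update only needs observed transitions `(state, action, reward, next_state)`, no model of the underlying dynamics is required, which is the sense in which such a controller is model-free.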

Journal: Electronic Journal of Probability 2013

Journal: CoRR 2018
Shujaat Khan, Alishba Sadiq, Imran Naseem, Roberto Togneri, Mohammed Bennamoun

In this work, a new class of stochastic gradient algorithms is developed based on q-calculus. Unlike the existing q-LMS algorithm, the proposed approach fully utilizes the concept of q-calculus by incorporating a time-varying q parameter. The proposed enhanced q-LMS (Eq-LMS) algorithm utilizes a novel, parameterless concept of error-correlation energy and normalization of the signal to ensure high con...

2001
Eyal Even-Dar, Yishay Mansour

We show the convergence of two deterministic variants of Q-learning. The first is the widely used optimistic Q-learning, which initializes the Q-values to large initial values and then follows a greedy policy with respect to the Q-values. We show that setting the initial value sufficiently large guarantees convergence to an ε-optimal policy. The second is a new and novel algo...
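The effect of optimistic initialization can be seen in a toy setting. The sketch below is an assumption-laden illustration, not the paper's algorithm or analysis: it uses a hypothetical single-state problem with two deterministic rewards, a fixed learning rate, and a purely greedy policy, and compares zero initialization with a large initial value.

```python
def run_greedy_q_learning(initial_value, rewards, gamma=0.9, alpha=0.5,
                          steps=200):
    """Greedy Q-learning on a hypothetical one-state, multi-action problem."""
    q = [initial_value] * len(rewards)
    for _ in range(steps):
        # Purely greedy choice; ties go to the lowest action index.
        action = max(range(len(q)), key=lambda a: q[a])
        # The only next state is the same state, so bootstrap from max(q).
        target = rewards[action] + gamma * max(q)
        q[action] += alpha * (target - q[action])
    return q


def greedy_action(q):
    """Index of the action with the highest Q-value."""
    return max(range(len(q)), key=lambda a: q[a])


rewards = [0.2, 1.0]                                  # action 1 is better
q_zero = run_greedy_q_learning(0.0, rewards)          # pessimistic start
q_optimistic = run_greedy_q_learning(10.0, rewards)   # optimistic start
```

With zero initialization the greedy policy locks onto action 0 after its first positive update and never tries action 1. With the optimistic start, each tried action's value drops toward its true return, so the greedy policy is pushed to try the other action and settles on the better one.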

2013
Neil O’Connell, Yuchen Pei

We introduce a q-weighted version of the Robinson-Schensted (column insertion) algorithm which is closely connected to q-Whittaker functions (or Macdonald polynomials with t = 0) and reduces to the usual Robinson-Schensted algorithm when q = 0. The q-insertion algorithm is ‘randomised’, or ‘quantum’, in the sense that when inserting a positive integer into a tableau, the output is a distributio...
