Search results for: q algorithm
Number of results: 863118. Filter results by year:
Multi-objective reinforcement learning applies reinforcement learning techniques to problems with multiple objectives. To resolve this, we propose a hybrid multi-objective optimization method that provides a mathematical guarantee that all policies belonging to the Pareto front can be found. The hybridization gave rise to Q-Managed, which combines the ε-constraint method and the Q-Learning algorithm, where the first limits the environment dy...
Jianqin Zhou (Dept. of Computer Science, Anhui University of Technology, Ma’anshan 243002, P. R. China) (E-mail: [email protected]) Abstract: A fast algorithm is presented for determining the linear complexity and the minimal polynomial of periodic sequences over GF(q) with period q^n p^m, where p is a prime, and q is a prime and a primitive root modulo p. The algorithm presented here generalizes...
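The fast, period-specific algorithm the abstract describes is not shown in the snippet. As a generic baseline for the same quantity, a minimal Berlekamp–Massey sketch over GF(2) computes the linear complexity of a bit sequence (this is the standard general-purpose method, not the paper's faster algorithm over GF(q)):

```python
def linear_complexity_gf2(s):
    """Berlekamp-Massey over GF(2): linear complexity of a bit list s."""
    C = [1]          # current connection polynomial, lowest degree first
    B = [1]          # copy of C from the last length change
    L = 0            # current linear complexity
    m = -1           # index of the last length change
    for n in range(len(s)):
        # discrepancy: does C predict s[n] from the previous L bits?
        d = s[n]
        for i in range(1, L + 1):
            d ^= C[i] & s[n - i]
        if d:
            T = C[:]
            shift = n - m
            C = C + [0] * max(0, shift + len(B) - len(C))
            for i in range(len(B)):
                C[i + shift] ^= B[i]      # C(x) += x^shift * B(x) over GF(2)
            if 2 * L <= n:                # length change needed
                L = n + 1 - L
                B = T
                m = n
    return L

linear_complexity_gf2([0, 0, 0, 1])  # -> 4 (a run of zeros ending in 1 has full complexity)
```

The all-ones sequence, by contrast, satisfies s_n = s_{n-1} and so has linear complexity 1.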
Ensemble algorithms can improve the performance of a given learning algorithm through the combination of multiple base classifiers into an ensemble. In this paper we attempt to train and combine the base classifiers using an adaptive policy. This policy is learnt through a Q-learning inspired technique. Its effectiveness for an essentially supervised task is demonstrated by experimental results...
Harmonic identification by using Adaptive Tabu Search (ATS) Method embedded in the active power filter is proposed in this paper. The use of the ATS identifies harmonic components more accurately and precisely. Besides the accuracy and precision, it is able to select only some particular harmonic orders that cause severe consequences to the system for elimination. This principle thus leads to t...
While most Reinforcement Learning work utilizes temporal discounting to evaluate performance, the reasons for this are unclear. Is it out of desire or necessity? We argue that it is not out of desire, and seek to dispel the notion that temporal discounting is necessary by proposing a framework for undiscounted optimization. We present a metric of undiscounted performance and an algorithm for fi...
Markov games are a framework which formalises n-agent reinforcement learning. For instance, Littman proposed the minimax-Q algorithm to model two-agent zero-sum problems. This paper proposes a new simple algorithm in this framework, QL2, and compares it to several standard algorithms (Q-learning, Minimax and minimax-Q). Experiments show that QL2 converges to optimal mixed policies, as minimax-Q...
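At each state, minimax-Q evaluates V(s) as the value of a zero-sum matrix game built from the Q-values: the max over the agent's mixed policies of the min over opponent actions. In general this requires a linear program, but for the 2×2 case the maximin value has a closed form; the helper below is an illustrative sketch, not code from the paper:

```python
def maximin_value_2x2(Q):
    """Value of a 2x2 zero-sum matrix game for the row (maximizing) player."""
    (a, b), (c, d) = Q
    row_mins = [min(a, b), min(c, d)]
    col_maxs = [max(a, c), max(b, d)]
    if max(row_mins) == min(col_maxs):   # pure-strategy saddle point exists
        return max(row_mins)
    # otherwise both players mix; classic closed form for 2x2 games
    return (a * d - b * c) / (a - b - c + d)

maximin_value_2x2([[1, -1], [-1, 1]])  # matching pennies -> 0.0
```

In minimax-Q this value replaces the max over actions used in plain Q-learning's bootstrap target.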
We propose the Artificial Continuous Prediction Market (ACPM) as a means to predict a continuous real value, by integrating a range of data sources and aggregating the results of different machine learning (ML) algorithms. ACPM adapts the concept of the (physical) prediction market to address the prediction of real values instead of discrete events. Each ACPM participant has a data source, a ML...
Temporal difference algorithms perform well on discrete and small problems. This paper proposes a modification of the Q-learning algorithm towards natural ability to receive a feature list instead of an already identified state in the input. Complete observability is still assumed. The algorithm, Naive Augmenting Q-Learning, has been designed through building a hierarchical structure of input f...
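The underlying tabular Q-learning update that such variants build on can be sketched on a toy chain environment (the environment, hyperparameters, and seed here are illustrative assumptions, not from the paper):

```python
import random

random.seed(0)  # seeded so the sketch is reproducible

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9, eps=0.1):
    """Learn to walk right along a chain; reward 1 on reaching the last state."""
    Q = [[0.0, 0.0] for _ in range(n_states)]   # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = 1 if Q[s][1] >= Q[s][0] else 0
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Q-learning update: bootstrap on the greedy value of the next state
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning_chain()
```

After training, the Q-values at the start state should favor moving right, with Q[0][1] approaching gamma**4.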
Several methods have been proposed in the reinforcement learning literature for learning optimal policies for sequential decision tasks. Q-learning is a model-free algorithm that has recently been applied to the Acrobot, a two-link arm with a single actuator at the elbow that learns to swing its free endpoint above a target height. However, applying Q-learning to a real Acrobot may be impracti...
Theorem 12.2. Let p and q be prime divisors of N, and let ℓ_p and ℓ_q be the largest prime divisors of p − 1 and q − 1, respectively. If ℓ_p ≤ B and ℓ_p < ℓ_q, then Algorithm 12.1 succeeds with probability at least 1 − 1/ℓ_q. Proof. If a ≡ 0 (mod p) then the algorithm succeeds in step 2, so we may assume a ⊥ p. When the algorithm reaches ℓ = ℓ_p in step 3 we have b = a^m, where m = ∏_{ℓ ≤ ℓ_p} ℓ^{e_ℓ} is a multipl...
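The structure of Theorem 12.2 (a gcd check in step 2, then b = a^m for m a product of small prime powers in step 3) matches the Pollard p−1 factoring method. A minimal sketch of that computation follows; the bound B and base a are illustrative choices, and Algorithm 12.1's exact steps may differ:

```python
from math import gcd

def pollard_p_minus_1(N, B=100, a=2):
    """Try to split N by computing a^m mod N, m = product of prime powers <= B."""
    if gcd(a, N) > 1:                 # the easy case from step 2: a shares a factor
        return gcd(a, N)
    b = a
    for ell in range(2, B + 1):
        # raise b to the largest power ell^e <= B (composite ell just repeat work)
        e = 1
        while ell ** (e + 1) <= B:
            e += 1
        b = pow(b, ell ** e, N)       # three-argument pow: modular exponentiation
        d = gcd(b - 1, N)
        if 1 < d < N:
            return d                  # nontrivial factor found
    return None                       # failed: p-1 not B-smooth, or both factors split at once

pollard_p_minus_1(299)  # 299 = 13 * 23, and 13 - 1 = 12 is 3-smooth -> 13
```

The success probability bound in the theorem reflects the chance that q − 1's largest prime factor ℓ_q does not also divide the accumulated exponent, so the gcd lands strictly between 1 and N.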