Search results for: q algorithm

Number of results: 863118

2007
Yang Cheng F. Landis Markley John L. Crassidis Yaakov Oshman

This paper presents an algorithm to average a set of quaternion observations. The average quaternion is determined by minimizing the weighted sum of the squared Frobenius norms of the corresponding attitude matrix differences, subject to the unit-norm constraint in the determined solution. Two cases are presented: one that incorporates scalar weights and one that incorporates general weights on...
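As background for the scalar-weighted case mentioned above, a commonly cited closed form takes the average to be the unit eigenvector of M = sum_i w_i q_i q_i^T with the largest eigenvalue. This is stated here as an assumption, since the abstract is truncated before it gives the solution. The sketch below finds that eigenvector by power iteration in plain Python; the function name `average_quaternion`, the component ordering, and the choice of the first quaternion as the starting vector are illustrative only.

```python
import math

def average_quaternion(quats, weights=None, iters=200):
    """Approximate the dominant eigenvector of M = sum_i w_i q_i q_i^T.

    quats: list of 4-component quaternions (any consistent component order).
    weights: positive scalar weights (assumed; defaults to all ones).
    """
    if weights is None:
        weights = [1.0] * len(quats)

    def normalize(v):
        n = math.sqrt(sum(c * c for c in v))
        return [c / n for c in v]

    # Build the symmetric 4x4 matrix M from normalized quaternions.
    M = [[0.0] * 4 for _ in range(4)]
    for q, w in zip(quats, weights):
        u = normalize(q)
        for r in range(4):
            for c in range(4):
                M[r][c] += w * u[r] * u[c]

    # Power iteration, started from the first quaternion (an assumption:
    # it must not be orthogonal to the dominant eigenvector).
    v = normalize(quats[0])
    for _ in range(iters):
        v = normalize([sum(M[r][c] * v[c] for c in range(4)) for r in range(4)])
    return v
```

Because M is built from outer products, q and -q contribute identically, so the sign ambiguity of quaternions does not affect the result.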

2012
Ulit Jaidee Héctor Muñoz-Avila

We present CLASSQ-L (for: class Q-learning), an application of the Q-learning reinforcement learning algorithm to play complete Wargus games. Wargus is a real-time strategy game where players control armies consisting of units of different classes (e.g., archers, knights). CLASSQ-L uses a single table for each class of unit so that each unit is controlled by, and updates, its class' Q-table. This enab...
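The idea of one shared Q-table per unit class can be sketched with the standard tabular Q-learning update, Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). This is a minimal illustration, not the CLASSQ-L implementation: the class name `ClassQTables`, the string states and actions, and the values of `alpha` and `gamma` are all hypothetical.

```python
from collections import defaultdict

class ClassQTables:
    """One Q-table per unit class, shared by every unit of that class."""

    def __init__(self, alpha=0.5, gamma=0.9):
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # tables[unit_class][(state, action)] -> estimated value, default 0.0
        self.tables = defaultdict(lambda: defaultdict(float))

    def update(self, unit_class, state, action, reward, next_state, next_actions):
        """Apply one standard Q-learning update to the class's table."""
        q = self.tables[unit_class]
        best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
        target = reward + self.gamma * best_next
        q[(state, action)] += self.alpha * (target - q[(state, action)])
        return q[(state, action)]
```

Two archers calling `update("archer", ...)` write to the same table, so experience gathered by one unit is immediately available to the others of its class.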

2014
James MacGlashan Michael L. Littman Fiery Cushman

Existing models of the evolution of social behavior typically involve innate strategies such as tit-for-tat. Yet, both behavioral and neural evidence indicates a substantial role for learned social behavior. We explore the evolutionary dynamics of two simple social behaviors among learning agents: Theft and punishment. In our simulation, agents employ Q-learning, a common reinforcement learning...

Journal: CoRR 2017
Richard Y. Chen Szymon Sidor Pieter Abbeel John Schulman

We show how an ensemble of Q-functions can be leveraged for more effective exploration in deep reinforcement learning. We build on well-established algorithms from the bandit setting, and adapt them to the Q-learning setting. We propose an exploration strategy based on upper-confidence bounds (UCB). Our experiments show significant gains on the Atari benchmark.
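One way to read a UCB-style rule over an ensemble is to score each action by the ensemble mean of its Q-values plus a multiple of their spread, then act greedily on that score. The sketch below assumes this mean-plus-standard-deviation form; the abstract does not spell out the rule, and the function name `ucb_action` and the coefficient `lam` are hypothetical.

```python
from statistics import mean, pstdev

def ucb_action(q_ensemble, lam=1.0):
    """Pick the action maximizing mean_k Q_k(s, a) + lam * std_k Q_k(s, a).

    q_ensemble[k][a] holds ensemble member k's Q-value for action a
    in the current state. lam trades off exploration against exploitation.
    """
    n_actions = len(q_ensemble[0])
    scores = []
    for a in range(n_actions):
        vals = [member[a] for member in q_ensemble]
        # Disagreement among members acts as an uncertainty bonus.
        scores.append(mean(vals) + lam * pstdev(vals))
    return max(range(n_actions), key=scores.__getitem__)
```

With `lam=0.0` the rule reduces to greedy selection on the ensemble mean; larger values favor actions on which the ensemble members disagree.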

2010
Y. B. Reddy

The efficient utilization of underutilized spectrum is the main theme of current research. Cognitive radio, with the help of a Q-learning algorithm, is used to detect the presence of primary signals and to utilize the spectrum in their absence. The proposed Q-learning algorithm model identifies previously known signals and learns to detect the signals which otherwise could not be ...

2011
Koichi Moriyama Satoshi Kurihara Masayuki Numao

We have proposed the utility-based Q-learning concept that supposes an agent internally has an emotional mechanism that derives subjective utilities from objective rewards and the agent uses the utilities as rewards of Q-learning. We have also proposed such an emotional mechanism that facilitates cooperative actions in Prisoner’s Dilemma (PD) games. However, this mechanism has been designed and...

2004
D. Blatt S. A. Murphy

We consider a new algorithm for reinforcement learning called A-learning. A-learning learns the advantages from a single training set. We compare A-learning with function approximation to Q-learning with function approximation and find that because A-learning approximates only the advantages it is less likely to exhibit bias due to the function approximation as compared to Q-learning. W...

Journal: CoRR 2013
Meirav Zehavi

We present three deterministic parameterized algorithms for well-studied packing and matching problems, namely, Weighted q-Dimensional p-Matching ((q, p)-WDM) and Weighted q-Set p-Packing ((q, p)-WSP). More specifically, we present an O(2.85043) time deterministic algorithm for (q, p)-WDM, an O(8.04143) time deterministic algorithm for the unweighted version of (3, p)-WDM, and an O((0.56201 · 2....

Journal: Computers & Mathematics with Applications 1976
