Search results for: q learning

Number of results: 717428

2003
Chang Deng, Meng Joo Er

This paper presents a Dynamic Fuzzy Q-Learning (DFQL) method that is capable of tuning Fuzzy Inference Systems (FIS) online. Online self-organizing learning is developed so that structure and parameter identification are accomplished automatically and simultaneously. Self-organizing fuzzy inference is introduced to calculate actions and Q-functions so as to enable us to deal with continuou...

1998
John E. Moody, Matthew Saffell

We propose to train trading systems by optimizing financial objective functions via reinforcement learning. The performance functions that we consider as value functions are profit or wealth, the Sharpe ratio and our recently proposed differential Sharpe ratio for online learning. In Moody & Wu (1997), we presented empirical results in controlled experiments that demonstrated the advantages of ...

Journal: Adaptive Behaviour 2014
Mehmet Dinçer Erbas, Alan F. T. Winfield, Larry Bull

Imitation is an example of social learning in which an individual observes and copies another’s actions. This paper presents a new method for using imitation as a way of enhancing the learning speed of individual agents that employ a well-known reinforcement learning algorithm, namely Q-learning. Compared to other research that uses imitation with reinforcement learning, our method uses imitati...
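
For readers unfamiliar with the Q-learning algorithm this abstract builds on, the standard tabular update can be sketched as follows. This is a minimal illustration of textbook Q-learning only, not the imitation method of the paper; the function name `q_update`, the state and action labels, and the values of `alpha` and `gamma` are all hypothetical choices made for the example.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    # Greedy value of the successor state (unseen entries default to 0.0).
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem, used only to exercise the update.
q_table = defaultdict(float)
actions = ["left", "right"]
new_value = q_update(q_table, "s0", "right", 1.0, "s1", actions)
```

With an all-zero table, a reward of 1.0 and `alpha=0.5`, the updated entry moves halfway towards the target.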

2008
Dan Erusalimchik, Gal A. Kaminka

In the research area of multi-robot systems, several researchers have reported consistent success in using heuristic measures to improve loose coordination in teams, by minimizing coordination costs using various heuristic techniques. While these heuristic methods have proven successful in several domains, they have never been formalized, nor have they been put in the context of existing work on ...

2008
Laëtitia Matignon, Guillaume J. Laurent, Nadine Le Fort-Piat

The article focuses on decentralized reinforcement learning (RL) in cooperative multi-agent games, where a team of independent learning agents (ILs) try to coordinate their individual actions to reach an optimal joint action. Within this framework, some algorithms based on Q-learning have been proposed in recent work. In particular, we are interested in Distributed Q-learning which finds optimal polici...
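
The truncated abstract does not spell out the Distributed Q-learning update. As it is commonly described in the literature, each independent learner keeps a table over its own actions only and updates it optimistically, never lowering an estimate. The sketch below assumes that description, a deterministic environment and non-negative rewards; the function name, state labels and `gamma` value are hypothetical.

```python
from collections import defaultdict


def distributed_q_update(q, state, own_action, reward, next_state, own_actions, gamma=0.9):
    """Optimistic update for one independent learner.

    q_i(s, a_i) <- max(q_i(s, a_i), r + gamma * max_a' q_i(s', a'))
    The agent never observes its teammates' actions; low returns caused by
    their exploration are ignored because the estimate can only increase.
    """
    best_next = max(q[(next_state, a)] for a in own_actions)
    candidate = reward + gamma * best_next
    q[(state, own_action)] = max(q[(state, own_action)], candidate)
    return q[(state, own_action)]


# Hypothetical example: the same individual action yields different team rewards
# depending on what the (unobserved) teammate did.
agent_q = defaultdict(float)
own_actions = ["a", "b"]
after_good = distributed_q_update(agent_q, "s0", "a", 1.0, "s1", own_actions)
after_bad = distributed_q_update(agent_q, "s0", "a", 0.0, "s1", own_actions)
```

The second call leaves the estimate unchanged, which is the optimistic behaviour that distinguishes this rule from the averaging update of ordinary Q-learning.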

2000
Martin V. Butz, David E. Goldberg, Wolfgang Stolzmann

In contrast to common Learning Classifier Systems (LCSs), classifiers in the Anticipatory Classifier System (ACS) have a condition-action-anticipation-payoff structure (Stolzmann, 1998). The learning is based on the accuracy of predicted environmental effects (i.e. anticipations) rather than on the payoff predictions, as in traditional LCSs, or the accuracy of payoff predictions, as in XCS (Wilson, 19...

Journal: CoRR 2018
Konstantin Böttinger, Patrice Godefroid, Rishabh Singh

Fuzzing is the process of finding security vulnerabilities in input-processing code by repeatedly testing the code with modified inputs. In this paper, we formalize fuzzing as a reinforcement learning problem using the concept of Markov decision processes. This in turn allows us to apply state-of-the-art deep Q-learning algorithms that optimize rewards, which we define from runtime properties of...

2001
Michael L. Littman

This paper describes an approach to reinforcement learning in multiagent general-sum games in which a learner is told to treat each other agent as either a “friend” or “foe”. This Q-learning-style algorithm provides strong convergence guarantees compared to an existing Nash-equilibrium-based learning rule.

2005
Adrian Agogino, Kagan Tumer

Enabling reinforcement learning to be effective in large-scale multi-agent Markov Decision Problems is a challenging task. To address this problem, we propose a multi-agent variant of Q-learning: “Q Updates with Immediate Counterfactual Rewards-learning” (QUICR-learning). Given a global reward function over all agents that the large-scale system is trying to maximize, QUICR-learning breaks down...

2013
Rashmi Sharma, Manish Prateek, Ashok K. Sinha

Reinforcement learning has its origins in animal learning theory. RL does not require prior knowledge but can autonomously obtain an optimal policy using knowledge gained through trial and error and continuous interaction with a dynamic environment. Due to its characteristics of self-improvement and online learning, reinforcement learning has become one of intelligent agent’s core tec...
