Search results for: q learning
Number of results: 717428
This paper proposes a Q-value update method that speeds up learning in Q-learning. When the Q-value of an executed action is small, even if that action is optimal, learning takes longer because the action is executed again less frequently. The proposed method increases the execution frequency of the optimal action by forcefully increasing its Q-value through the Q-value update ...
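For context, the standard tabular Q-learning update that such methods modify can be sketched as follows. This is the textbook rule, not the method proposed in the abstract above; the function name `q_learning_update`, the list-of-lists table layout, and the default values of `alpha` and `gamma` are assumptions chosen for illustration.

```python
def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.99, done=False):
    """Apply one standard Q-learning update in place and return the TD error.

    q_table is a list of rows, one per state, each holding one value per action
    (a hypothetical layout chosen for this sketch).
    """
    # Bootstrap from the greedy value of the next state unless the episode ended.
    bootstrap = 0.0 if done else gamma * max(q_table[next_state])
    # Temporal-difference error: target minus current estimate.
    td_error = reward + bootstrap - q_table[state][action]
    # Move the estimate a fraction alpha toward the target.
    q_table[state][action] += alpha * td_error
    return td_error


# Usage: 3 states, 2 actions, all estimates initialised to zero.
q_table = [[0.0, 0.0] for _ in range(3)]
td_error = q_learning_update(q_table, state=0, action=1, reward=1.0, next_state=2)
```

Because the update moves only a fraction `alpha` of the way toward the target, an action with a small initial Q-value can stay unattractive to a greedy policy for many steps, which is the slowdown the abstract describes.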
We study learning in Minority Games (MG) with multiple resources. The MG is a repeated conflicting interest game involving a large number of agents. So far, the learning mechanisms studied were rather naive and involved only exploitation of the best strategy at the expense of exploring new strategies. Instead, we use a reinforcement learning method called Q-learning and show how it improves the...
Given an environment with continuous state spaces and discrete actions, we investigate using a Double Deep Q-learning Reinforcement Agent to find optimal policies using the LunarLander-v2 OpenAI gym environment.
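The target computation that distinguishes Double Q-learning from plain deep Q-learning can be sketched in a few lines. This is a generic illustration under stated assumptions, not the cited work's implementation: `double_dqn_target` is a hypothetical helper, and the two Q-value lists stand in for the outputs of an online network and a target network on the next state.

```python
def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Return the Double Q-learning regression target for one transition.

    q_online_next and q_target_next are per-action value estimates for the
    next state from the online and target networks (hypothetical inputs).
    """
    if done:
        # No bootstrapping past a terminal state.
        return reward
    # Select the next action with the online estimates...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ...but evaluate it with the target estimates, decoupling the two roles.
    return reward + gamma * q_target_next[best_action]


# Usage: the online network prefers action 1; the target network scores it 0.4.
target = double_dqn_target(1.0, False, [0.2, 0.9], [0.3, 0.4], gamma=0.5)
```

Separating action selection from action evaluation is intended to reduce the overestimation that arises when a single estimator both picks and scores the maximising action.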
This paper addresses the problem of predicting popularity of comments in an online discussion forum using reinforcement learning, particularly addressing two challenges that arise from having natural language state and action spaces. First, the state representation, which characterizes the history of comments tracked in a discussion at a particular point, is augmented to incorporate the global ...
This paper proposes a novel method for supervised classification based on the methodology of Q-analysis. The classification is based on finding ‘relevant’ structures in the features describing the data, and using them to define each of the classes. The features not included in the structural definition of a class are considered as ‘irrelevant’. The paper uses three different data-sets to experi...
While most Reinforcement Learning work utilizes temporal discounting to evaluate performance, the reasons for this are unclear. Is it out of desire or necessity? We argue that it is not out of desire, and seek to dispel the notion that temporal discounting is necessary by proposing a framework for undiscounted optimization. We present a metric of undiscounted performance and an algorithm for fi...
Markov games are a framework which formalises n-agent reinforcement learning. For instance, Littman proposed the minimax-Q algorithm to model two-agent zero-sum problems. This paper proposes a new simple algorithm in this framework, QL2, and compares it to several standard algorithms (Q-learning, Minimax and minimax-Q). Experiments show that QL2 converges to optimal mixed policies, as minimax-Q...
We propose the Artificial Continuous Prediction Market (ACPM) as a means to predict a continuous real value, by integrating a range of data sources and aggregating the results of different machine learning (ML) algorithms. ACPM adapts the concept of the (physical) prediction market to address the prediction of real values instead of discrete events. Each ACPM participant has a data source, a ML...
Several methods have been proposed in the reinforcement learning literature for learning optimal policies for sequential decision tasks. Q-learning is a model-free algorithm that has recently been applied to the Acrobot, a two-link arm with a single actuator at the elbow that learns to swing its free endpoint above a target height. However, applying Q-learning to a real Acrobot may be impracti...
Anxiety disorders are the most common reasons for referral to specialized clinics. If the response to stress is changed, anxiety can be greatly controlled. The most obvious effect of stress occurs on the circulatory system, especially through sweating. The electrical conductivity of the skin, in other words the galvanic skin response (GSR), which depends on stress level, is used; besides this parameter pe...