نتایج جستجو برای: temporal difference learning
تعداد نتایج: 1222164 فیلتر نتایج به سال:
Kernel methods have attracted many research interests recently since by utilizing Mercer kernels, non-linear and non-parametric versions of conventional supervised or unsupervised learning algorithms can be implemented and usually better generalization abilities can be obtained. However, kernel methods in reinforcement learning have not been popularly studied in the literature. In this paper, w...
We introduce a class of Multigrid based temporal difference algorithms for reinforcement learning with linear function approximation. Multigrid methods are commonly used to accelerate convergence of iterative numerical computation algorithms. The proposed Multigrid-enhanced TD(λ) algorithms allows to accelerate the convergence of the basic TD(λ) algorithm while keeping essentially the same per-...
Reinforcement learning has been used for training game playing agents. The value function for a complex game must be approximated with a continuous function because the number of states becomes too large to enumerate. Temporal-difference learning with self-play is one method successfully used to derive the value approximation function. Coevolution of the value function is also claimed to yield ...
Learning to play the game Tetris has been a common challenge on a few past machine learning competitions. Most have achieved tremendous success by using a heuristic cost function during the game and optimizing each move based on those heuristics. This approach dates back to 1990’s when it has been shown to eliminate successfully a couple hundred lines. However, we decided to completely depart f...
This paper presents a novel representational framework for the Temporal Difference (TD) model of learning, which allows the computation of configural stimuli – cumulative compounds of stimuli that generate perceptual emergents known as configural cues. This Simultaneous and Serial Configural-cue Compound Stimuli Temporal Difference model (SSCC TD) can model both simultaneous and serial stimulus...
In this paper, we study the Temporal Difference (TD) learning with linear value function approximation. It is well known that most TD learning algorithms are unstable with linear function approximation and off-policy learning. Recent development of Gradient TD (GTD) algorithms has addressed this problem successfully. However, the success of GTD algorithms requires a set of well chosen features,...
This paper introduces locally weighted temporal difference learning for evaluation of a class of policies whose value function is nonlinear in the state. Least squares temporal difference learning is used for training local models according to a distance metric in state-space. Empirical evaluations are reported demonstrating learning performance on a number of strongly non-linear value function...
BACKGROUND The neurobiology of bulimia nervosa (BN) is poorly understood. Recent animal literature suggests that binge eating is associated with altered brain dopamine (DA) reward function. In this study, we wanted to investigate DA-related brain reward learning in BN. METHODS Ill BN (n = 20, age: mean = 25.2, SD = 5.3 years) and healthy control women (CW) (n = 23, age: mean = 27.2, SD = 6.4 ...
Temporal Difference learning or TD(λ) is a fundamental algorithm in the field of reinforcement learning. However, setting TD’s λ parameter, which controls the timescale of TD updates, is generally left up to the practitioner. We formalize the λ selection problem as a bias-variance trade-off where the solution is the value of λ that leads to the smallest Mean Squared Value Error (MSVE). To solve...
Off-policy reinforcement learning has many applications including: learning from demonstration, learning multiple goal seeking policies in parallel, and representing predictive knowledge. Recently there has been an proliferation of new policyevaluation algorithms that fill a longstanding algorithmic void in reinforcement learning: combining robustness to offpolicy sampling, function approximati...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید