Search results for: temporal difference learning
Number of results: 1222164
This paper describes improvements to the temporal difference learning method. The standard form of the method has the problem that two control parameters, learning rate and temporal discount, need to be chosen appropriately. These parameters can have a major effect on performance, particularly the learning rate parameter, which affects the stability of the process as well as the number of obser...
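For readers unfamiliar with the two control parameters this abstract mentions, the following is a minimal sketch of the standard tabular TD(0) update, not the improved method the paper proposes. The toy environment (a deterministic chain that pays a reward of 1 on reaching the end), the function name `td0_chain`, and the default values for `alpha` (learning rate) and `gamma` (temporal discount) are all assumptions made for illustration.

```python
def td0_chain(n_states=4, alpha=0.5, gamma=0.9, episodes=200):
    """Tabular TD(0) on a hypothetical deterministic chain.

    The agent always moves from state s to s + 1. Stepping out of the
    last state ends the episode and yields reward 1; all other steps
    yield reward 0.
    """
    # One extra slot for the terminal state, whose value stays 0.
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        state = 0
        while state < n_states:
            next_state = state + 1
            reward = 1.0 if next_state == n_states else 0.0
            # TD error: bootstrapped target minus current estimate.
            td_error = reward + gamma * values[next_state] - values[state]
            # The learning rate scales how far each estimate moves.
            values[state] += alpha * td_error
            state = next_state
    return values[:n_states]


values = td0_chain()
print([round(v, 3) for v in values])
```

In this toy chain the estimates approach `gamma ** (n_states - 1 - s)` for each state `s`. In a deterministic setting like this one, a smaller `alpha` slows convergence and an `alpha` above 1 overshoots the target on every update, which is one way to see the sensitivity to the learning rate that the abstract describes.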
In this paper we discuss the problem of automatically learning evaluation function parameters in a chess program. In particular, we describe some experiments in which our chess program KnightCap learnt the parameters of its evaluation function using a combination of Temporal Difference learning and on-line play on FICS and ICC. KnightCap is freely available on the web from http://wwwsyseng.anu....
In this article a neural network architecture is presented that is able to build a soft segmentation of a two-dimensional input. This network architecture is applied to position evaluation in the game of Go. It is trained using self-play and temporal difference learning combined with a rich two-dimensional reinforcement signal. Two experiments are performed, one using the raw board position as ...
It is now almost impossible to deal with spatial data without considering some explicit specification that captures possible spatial effects. One valuable feature of spatial econometrics models is their decomposition of marginal effects into spatial spillover effect and spatial externalities. Progress in interpreting spatial econometrics models has now been extended to the spatial-panel case. H...
Learning-oriented assessment seeks to emphasise that a fundamental purpose of assessment should be to promote learning. It mirrors formative assessment and assessment for learning processes. It can be defined as actions undertaken by teachers and/or students, which provide feedback for the improvement of teaching and learning. It also contrasts with equally important measurement-focused appro...
A promising approach to learning to play board games is to use reinforcement learning algorithms that can learn a game position evaluation function. In this paper we examine and compare three different methods for generating training games: 1) Learning by self-play, 2) Learning by playing against an expert program, and 3) Learning from viewing experts play against each other. Although the third po...
Imitation learning is the study of learning how to act given a set of demonstrations provided by a human expert. It is intuitively apparent that learning to take optimal actions is a simpler undertaking in situations that are similar to the ones shown by the teacher. However, imitation learning approaches do not tend to use this insight directly. In this paper, we introduce State Aware Imitatio...
Reinforcement learning (RL) originates from animal learning theory. RL does not require prior knowledge; it can autonomously obtain an optimal policy using knowledge gained through trial and error and continuous interaction with a dynamic environment. Due to its self-improving and online-learning characteristics, reinforcement learning has become one of intelligent agent’s core tec...