Search results for: reinforcement learning
Number of results: 619520
Research in decision-making has focused on the role of dopamine and its striatal targets in guiding choices via learned stimulus-reward or stimulus-response associations, behavior that is well described by reinforcement learning theories. However, basic reinforcement learning is relatively limited in scope and does not explain how learning about stimulus regularities or relations may guide deci...
A Condorcet social choice procedure elects the candidate that beats every other candidate under simple majority when such a candidate exists. The reinforcement axiom roughly states that given two groups of individuals, if these two groups select the same alternative, then this alternative must also be selected by their union. Condorcet social choice procedures are known to violate this axiom. O...
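The Condorcet criterion described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's procedure: the function name `condorcet_winner` and the ballot format (each ballot is a full ranking, most preferred first) are assumptions made here. It only reports whether a candidate beats every other candidate under simple majority, and it does not implement any fallback rule for profiles without such a candidate.

```python
from itertools import combinations


def condorcet_winner(ballots):
    """Return the candidate that beats every other one head-to-head, or None.

    `ballots` is a list of rankings (hypothetical format): each ranking lists
    all candidates, most preferred first.
    """
    candidates = sorted(set(ballots[0]))
    pairwise_wins = {c: 0 for c in candidates}

    for a, b in combinations(candidates, 2):
        # Count voters who rank a above b; the rest rank b above a.
        prefer_a = sum(1 for ballot in ballots if ballot.index(a) < ballot.index(b))
        prefer_b = len(ballots) - prefer_a
        if prefer_a > prefer_b:
            pairwise_wins[a] += 1
        elif prefer_b > prefer_a:
            pairwise_wins[b] += 1
        # An exact tie gives neither candidate a pairwise win.

    for c in candidates:
        if pairwise_wins[c] == len(candidates) - 1:
            return c
    return None  # e.g. a majority cycle: no Condorcet winner exists


group = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
cycle = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]
print(condorcet_winner(group))
print(condorcet_winner(cycle))
```

The second profile is the classic majority cycle, where the "when such a candidate exists" condition fails and a Condorcet procedure must fall back on some other rule.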
We extend EWA learning to games in which only the set of possible foregone payoffs from unchosen strategies is known. We assume players estimate unknown foregone payoffs from a strategy by substituting the last payoff actually received from that strategy, or by clairvoyantly guessing the actual foregone payoff. Either assumption improves predictive accuracy of EWA. Learning parameters are also es...
useful, please do cite my book (for which this material was prepared), now in its second edition.
Computational models of learning have proved largely successful in characterizing potential mechanisms which allow humans to make decisions in uncertain and volatile contexts. We report here findings that extend existing knowledge and show that a modified reinforcement learning model, which has separate parameters according to whether the previous trial gave a reward or a punishment, can provid...
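One way to read "separate parameters according to whether the previous trial gave a reward or a punishment" is a delta-rule value update whose learning rate depends on the previous outcome. The sketch below is an assumption-laden illustration, not the model from the paper: the function name `run_value_updates`, both learning-rate parameters, the use of their mean on the first trial, and the coding of outcomes as positive (reward) or non-positive (punishment) are all hypothetical choices.

```python
def run_value_updates(outcomes, alpha_after_reward=0.5, alpha_after_punish=0.25, v0=0.0):
    """Delta-rule updates with a learning rate chosen by the previous outcome.

    All parameter names and defaults are illustrative assumptions.
    Returns the value estimate after each trial.
    """
    value = v0
    previous = None
    trajectory = []
    for outcome in outcomes:
        if previous is None:
            # No previous trial yet: use the mean of the two rates (assumption).
            alpha = (alpha_after_reward + alpha_after_punish) / 2
        elif previous > 0:
            alpha = alpha_after_reward
        else:
            alpha = alpha_after_punish
        # Standard prediction-error update: V <- V + alpha * (r - V).
        value += alpha * (outcome - value)
        trajectory.append(value)
        previous = outcome
    return trajectory


print(run_value_updates([1, 1, -1, 1]))
```

Setting the two rates equal recovers the ordinary single-rate delta rule, which makes the modified model easy to compare against the basic one.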
This survey considers response generating systems that improve their behaviour using reinforcement learning. The difference between unsupervised learning, supervised learning, and reinforcement learning is described. Two general problems concerning learning systems are presented: the credit assignment problem and the problem of perceptual aliasing. Notations and some general issues concerning re...
We propose to train trading systems by optimizing financial objective functions via reinforcement learning. The performance functions that we consider as value functions are profit or wealth, the Sharpe ratio and our recently proposed differential Sharpe ratio for online learning. In Moody & Wu (1997), we presented empirical results in controlled experiments that demonstrated the advantages of ...
To improve bandwidth allocation using feedback from the operational environment, adaptable bandwidth planning based on reinforcement learning is proposed. The approach is based on new constrained scheduling algorithms controlled by reinforcement learning techniques. Different constrained scheduling algorithms, such as “conflict free scheduling with minimum duration”, “partial disp...