Search results for: reward beta model
Number of results: 336,113
BACKGROUND The value of a predicted reward can be estimated based on the conjunction of both the intrinsic reward value and the length of time to obtain it. The question we addressed is how the two aspects, reward size and proximity to reward, influence the responses of neurons in rostral anterior cingulate cortex (rACC), a brain region thought to play an important role in reward processing. ...
Recent findings have demonstrated that reward feedback alone can drive motor learning. However, it is not yet clear whether reward feedback alone can lead to learning when a perturbation is introduced abruptly, or how a reward gradient can modulate learning. In this study, we provide reward feedback that decays continuously with increasing error. We asked whether it is possible to learn an abru...
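The "reward feedback that decays continuously with increasing error" described in this abstract can be sketched as a graded reward function. This is a minimal illustration assuming a Gaussian decay profile; the actual functional form and width used in the study are not specified in the snippet.

```python
import math

# Hedged sketch: reward that decays continuously with error magnitude.
# sigma controls how steeply reward falls off; 5.0 is an illustrative
# assumption, not a value taken from the study.
def graded_reward(error, sigma=5.0):
    return math.exp(-(error ** 2) / (2 * sigma ** 2))

print(round(graded_reward(0.0), 2))  # maximal reward at zero error
print(round(graded_reward(5.0), 2))  # reduced reward one sigma away
```

A smooth gradient like this gives the learner information about error direction through reward magnitude alone, which is the key difference from binary (hit/miss) reward feedback.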
Reward sensitivity or the tendency to engage in motivated approach behavior in the presence of rewarding stimuli, may be a contributing factor to vulnerability to disinhibitory behaviors. While evidence exists for a reward sensitivity-related increased response in reward brain areas (i.e., nucleus accumbens or midbrain) during the processing of reward cues, it is unknown how this trait modulate...
Simulating mood within a decision-making process has been shown to allow cooperation to occur within the Prisoner’s Dilemma. In this paper we propose how to integrate a mood model into the classical reinforcement learning algorithm Sarsa, and show how this addition can allow self-interested agents to be successful within a multi-agent environment. The human-inspired moody agent will learn to co...
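The classical Sarsa algorithm that this abstract builds on can be sketched as follows. This is a minimal tabular version run on a toy corridor environment (the environment and its `step` interface are illustrative assumptions; the paper's mood extension is not shown):

```python
import random
from collections import defaultdict

def sarsa(step, start, actions, episodes=500, alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Sarsa: on-policy TD control with an epsilon-greedy policy.
    `step(state, action) -> (next_state, reward, done)` is a toy
    environment interface assumed here for illustration."""
    Q = defaultdict(float)

    def choose(s):
        if random.random() < eps:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(s, a)])

    for _ in range(episodes):
        s, a = start, choose(start)
        done = False
        while not done:
            s2, r, done = step(s, a)
            a2 = choose(s2) if not done else None
            target = r + (gamma * Q[(s2, a2)] if not done else 0.0)
            Q[(s, a)] += alpha * (target - Q[(s, a)])  # Sarsa update
            s, a = s2, a2
    return Q

# Toy corridor: moving 'right' from state 0 toward state 3 earns reward 1.
def step(s, a):
    s2 = min(s + 1, 3) if a == 'right' else max(s - 1, 0)
    return s2, (1.0 if s2 == 3 else 0.0), s2 == 3

random.seed(0)
Q = sarsa(step, 0, ['left', 'right'])
print(Q[(2, 'right')] > Q[(2, 'left')])  # the agent prefers moving right
```

Because Sarsa bootstraps from the action the policy actually takes next, it is on-policy, which is what makes a mood-modulated action-selection rule a natural place to intervene.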
This paper proposes an online transfer framework to capture the interaction among agents and shows that current transfer learning in reinforcement learning is a special case of online transfer. Furthermore, this paper re-characterizes existing agents-teaching-agents methods as online transfer and analyzes one such teaching method in three ways. First, the convergence of Q-learning and Sarsa with ...
Reinforcement learning (RL) is a machine learning technique for sequential decision making. This approach is well proven in many small-scale domains. The true potential of this technique cannot be fully realised until it can adequately deal with the large domain sizes that typically describe real world problems. RL with function approximation is one method of dealing with the domain size proble...
We propose expected policy gradients (EPG), which unify stochastic policy gradients (SPG) and deterministic policy gradients (DPG) for reinforcement learning. Inspired by Expected Sarsa, EPG integrates (or sums) across actions when estimating the gradient, instead of relying only on the action in the sampled trajectory. For continuous action spaces, we first derive a practical result for Gaussi...
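The Expected Sarsa idea that inspires EPG, averaging over the policy's action distribution instead of using only the single sampled next action, can be shown in a few lines. This sketch computes the Expected Sarsa target under an epsilon-greedy policy; the state names and Q-values are illustrative:

```python
def expected_sarsa_target(Q, s_next, actions, reward, gamma=0.9, eps=0.1):
    """Expected Sarsa target: average Q over the epsilon-greedy action
    distribution at the next state, rather than bootstrapping from the
    one action that happened to be sampled (as plain Sarsa does)."""
    greedy = max(actions, key=lambda a: Q[(s_next, a)])
    expectation = 0.0
    for a in actions:
        # eps/|A| exploration mass, plus (1 - eps) on the greedy action
        prob = eps / len(actions) + ((1 - eps) if a == greedy else 0.0)
        expectation += prob * Q[(s_next, a)]
    return reward + gamma * expectation

Q = {('s1', 'L'): 0.0, ('s1', 'R'): 1.0}
print(round(expected_sarsa_target(Q, 's1', ['L', 'R'], reward=0.5), 3))  # → 1.355
```

Summing over actions removes the variance contributed by next-action sampling, the same benefit EPG claims for the policy-gradient estimate.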
In order to scale to problems with large or continuous state-spaces, reinforcement learning algorithms need to be combined with function approximation techniques. The majority of work on function approximation for reinforcement learning has so far focused either on global function approximation with a static structure (such as multi-layer perceptrons), or on constructive architectures using loc...
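The combination of reinforcement learning with function approximation discussed in these abstracts can be illustrated with the simplest case: linear value approximation with a semi-gradient TD(0) update. The feature map and toy chain below are illustrative assumptions (one-hot features reduce this to the tabular case; the cited work uses richer global or locally constructive approximators):

```python
import numpy as np

# Linear value-function approximation: V(s) ~ w . phi(s).
# One semi-gradient TD(0) update step; phi is a hypothetical feature map.
def td0_update(w, phi, s, r, s_next, done, alpha=0.1, gamma=0.9):
    target = r + (0.0 if done else gamma * np.dot(w, phi(s_next)))
    delta = target - np.dot(w, phi(s))          # TD error
    return w + alpha * delta * phi(s)           # move w along the gradient

# One-hot features over a 3-state chain (the tabular special case).
phi = lambda s: np.eye(3)[s]
w = np.zeros(3)
for _ in range(200):
    # Repeatedly observe the episode 0 -> 1 -> terminal, reward 1 at the end.
    w = td0_update(w, phi, 0, 0.0, 1, False)
    w = td0_update(w, phi, 1, 1.0, 2, True)
print(round(w[1], 2))  # converges toward 1.0; w[0] toward gamma * w[1]
```

Swapping `phi` for tile coding, a radial-basis network, or a multi-layer perceptron changes only the feature map, which is exactly the design axis (static global vs. constructive local architectures) that this abstract contrasts.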
The Temporal Mobile Stochastic Logic (MOSL) has been introduced in previous work by the authors for formulating properties of systems specified in STOKLAIM, a Markovian extension of KLAIM. The main purpose of MOSL is to address key functional aspects of global computing such as distribution awareness, mobility, and security and their integration with performance and dependability guarantees. In ...