Search results for: reward processes
Number of results: 554393
The distribution theory for reward functions on semi-Markov processes has been of interest since the early 1960s. The relevant asymptotic distribution theory has been satisfactorily developed. On the other hand, it has been noticed that it is difficult to find exact distribution results which lead to the effective computation of such distributions. Note that there is no satisfactory exact distr...
Adolescents often respond differently than adults to the same salient motivating contexts, such as peer interactions and pleasurable stimuli. Delineating the neural processing differences of adolescents is critical to understanding this phenomenon, as well as the bases of serious behavioral and psychiatric vulnerabilities, such as drug abuse, mood disorders, and schizophrenia. We believe that a...
We consider discrete-time, finite-state-space Markov reward processes which depend on a set of parameters. Previously, we proposed a simulation-based methodology to tune the parameters to optimize the average reward. The resulting algorithms converge with probability 1, but may have a high variance. Here we propose two approaches to reduce the variance, which however introduce a new bias into t...
This paper studies a discrete-time total-reward Markov decision process (MDP) with a given initial state distribution. A (randomized) stationary policy can be split on a given set of states if the occupancy measure of this policy can be expressed as a convex combination of the occupancy measures of stationary policies, each selecting deterministic actions on the given set and coinciding with th...
We consider how state similarity in average reward Markov decision processes (MDPs) may be described by pseudometrics. Introducing the notion of adequate pseudometrics which are well adapted to the structure of the MDP, we show how these may be used for state aggregation. Upper bounds on the loss that may be caused by working on the aggregated instead of the original MDP are given and compared ...
Sequential decision problems can often be modeled as Markov decision processes. Classical solution approaches assume that the parameters of the model are known. However, model parameters are usually estimated and uncertain in practice. As a result, managers are often interested in how estimation errors affect the optimal solution. In this paper we illustrate how sensitivity analysis can be perf...
As an extension of the discrete-time case, this note investigates the variance of the total cumulative reward for the embedded Markov chain of semi-Markov processes. Under the assumption that the chain is aperiodic and contains a single class of recurrent states, recursive formulae for the variance are obtained which show that the variance growth rate is asymptotically linear in time. Expression...
The discovery that delta-9-tetrahydrocannabinol (Δ(9)-THC) is the primary psychoactive ingredient in marijuana prompted research that helped elucidate the endogenous cannabinoid system of the brain. Δ(9)-THC and other cannabinoid ligands with agonist action (CP 55,940, HU210, and WIN 55,212-2) increase firing of dopamine neurons and increase synaptic dopamine in brain regions associated with re...