Search results for: q policy
Number of results: 381585
It is increasingly acknowledged that many threats to an organisation’s computer systems can be attributed to the behaviour of computer users. To quantify these human-based information security vulnerabilities, we are developing the Human Aspects of Information Security Questionnaire (HAIS-Q). The aim of this paper was twofold. The first aim was to outline the conceptual development of the HAIS-...
This paper presents a new method to learn online policies in continuous state, continuous action, model-free Markov decision processes, with two properties that are crucial for practical applications. First, the policies are implementable with a very low computational cost: once the policy is computed, the action corresponding to a given state is obtained in logarithmic time with respect to the...
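The snippet is truncated before it names the underlying data structure, so as a hedged illustration of how a computed policy can answer state queries in logarithmic time, the sketch below stores the policy in a balanced decision tree over the state space. This is a hypothetical stand-in, not the paper's method; `Node`, `select_action`, and the bang-bang example are invented for illustration.

```python
# Hypothetical sketch (not the paper's exact method): a policy stored as a
# balanced decision tree over the state space gives O(log n) action lookup,
# where n is the number of leaf cells.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    dim: int = 0            # state dimension tested at this node
    threshold: float = 0.0  # split point
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    action: Optional[float] = None  # set only at leaves

def select_action(node: Node, state: list) -> float:
    """Descend from root to leaf: O(depth) = O(log n) for a balanced tree."""
    while node.action is None:
        node = node.left if state[node.dim] <= node.threshold else node.right
    return node.action

# Toy 1-D example: a bang-bang policy switching at state[0] = 0.
policy = Node(dim=0, threshold=0.0,
              left=Node(action=+1.0), right=Node(action=-1.0))
print(select_action(policy, [-0.3]))  # +1.0
```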
– All-unit order frequency discount
  • 94%-effective policy for a variable base period
– Incremental order frequency discount
  • 94%-effective policy for a fixed base period
  • 98%-effective policy for a variable base period
Introduction
• A supply chain is a network of facilities that procure raw materials, transform them into intermediate goods and then final products, and deliver the products t...
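The 94%/98% effectiveness figures are characteristic of power-of-two reorder-interval policies. As a hedged sketch, assuming an EOQ-style average cost c(T) = K/T + g·T (the function `power_of_two_interval` and the constants below are invented for illustration), the best interval of the form base · 2^k can be found by rounding in log space:

```python
import math

# Hedged sketch: power-of-two reorder-interval policies, the classic setting
# behind 94%-effective (fixed base period) and 98%-effective (variable base
# period) guarantees. K = fixed order cost, g = holding-cost rate; the
# unconstrained optimum of c(T) = K/T + g*T is T* = sqrt(K/g).
def power_of_two_interval(K: float, g: float, base: float) -> float:
    """Best reorder interval of the form base * 2**k (k an integer)."""
    cost = lambda T: K / T + g * T
    t_star = math.sqrt(K / g)
    k = round(math.log2(t_star / base))          # nearest power of two
    return min((base * 2 ** (k + d) for d in (-1, 0, 1)), key=cost)

K, g, base = 100.0, 4.0, 1.0
T = power_of_two_interval(K, g, base)
ratio = (K / T + g * T) / (2 * math.sqrt(K * g))  # cost vs. true optimum
print(T, ratio)  # cost within ~1.06x of optimal, i.e. at least 94% effective
```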
As powerful probabilistic models for optimal policy search, partially observable Markov decision processes (POMDPs) still suffer from problems such as hidden state and uncertainty in action effects. In this paper, a novel approximate algorithm, Genetic-Algorithm-based Q-Learning (GAQ-Learning), is proposed to solve POMDP problems. In the proposed methodology, genetic algorithms maintain ...
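The snippet cuts off before describing what the genetic algorithms maintain. A minimal hedged sketch of the GA half of such a method, assuming the GA keeps a population of policy parameter vectors scored by simulated return (`episode_return` below is a toy stand-in for POMDP rollouts with Q-learning):

```python
import numpy as np

# Hedged sketch of the genetic-algorithm half of a GAQ-Learning-style method
# (the snippet above is truncated, so the paper's exact operators are unknown):
# a GA maintains a population of policy parameter vectors, each scored by
# simulated return.
rng = np.random.default_rng(0)

def episode_return(theta: np.ndarray) -> float:
    # Stand-in fitness: a real implementation would roll the policy out in
    # the POMDP, with Q-learning refining value estimates along the way.
    return float(-np.sum((theta - 0.5) ** 2))  # toy objective, optimum at 0.5

pop = rng.normal(size=(20, 8))                      # 20 policies, 8 params each
for _ in range(50):
    fitness = np.array([episode_return(p) for p in pop])
    elite = pop[np.argsort(fitness)[-5:]]           # keep the 5 fittest
    parents = elite[rng.integers(0, 5, size=20)]    # resample with replacement
    pop = parents + rng.normal(scale=0.1, size=pop.shape)  # Gaussian mutation
print(max(episode_return(p) for p in pop))          # approaches 0.0
```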
This paper presents a new problem-solving approach that is able to generate an optimal policy for finite-state stochastic sequential decision-making problems with high data efficiency. The proposed algorithm iteratively builds and improves an approximate Markov Decision Process (MDP) model along with cost-to-go value approximations by generating finite-length trajectories through the state-...
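As a hedged sketch of the general idea (not the paper's exact algorithm), one can build an empirical MDP model from sampled transitions and then compute cost-to-go values on that model by value iteration; the `record` helper and the toy dynamics below are invented for illustration:

```python
import numpy as np

# Hedged sketch: estimate an MDP model (transitions and costs) from sampled
# trajectories, then run value iteration on the empirical model.
n_states, n_actions, gamma = 5, 2, 0.95
counts = np.zeros((n_states, n_actions, n_states))
costs = np.zeros((n_states, n_actions))

def record(s, a, c, s_next):
    """Update transition counts and a running-average cost model."""
    counts[s, a, s_next] += 1
    n = counts[s, a].sum()
    costs[s, a] += (c - costs[s, a]) / n

# record() would be called for every (s, a, cost, s') along each trajectory;
# here we feed it toy transitions instead.
rng = np.random.default_rng(0)
for _ in range(2000):
    s, a = rng.integers(n_states), rng.integers(n_actions)
    record(s, a, float(s == 0), (s + a) % n_states)

P = counts / np.maximum(counts.sum(axis=2, keepdims=True), 1)  # empirical P
V = np.zeros(n_states)
for _ in range(200):                                # value iteration on model
    V = np.min(costs + gamma * P @ V, axis=1)
print(V)
```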
This paper presents a new approach to hierarchical reinforcement learning based on decomposing the target Markov decision process (MDP) into a hierarchy of smaller MDPs and decomposing the value function of the target MDP into an additive combination of the value functions of the smaller MDPs. The decomposition, known as the MAXQ decomposition, has both a procedural semantics—as a subroutine hi...
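The MAXQ decomposition referenced above expresses the value of a composite task additively, Q(i, s, a) = V(a, s) + C(i, s, a) with V(i, s) = max_a Q(i, s, a), where V is read directly from a table when i is a primitive action. A minimal sketch with hand-filled tables (the toy hierarchy and numbers are invented):

```python
# Hedged sketch of the MAXQ additive decomposition: the value of a composite
# task is the value of the chosen subtask plus a learned "completion" term,
#   Q(i, s, a) = V(a, s) + C(i, s, a),   V(i, s) = max_a Q(i, s, a).
def V(task, s, children, V_prim, C):
    if task in V_prim:                      # primitive action: stored value
        return V_prim[task][s]
    return max(V(a, s, children, V_prim, C) + C[(task, s, a)]
               for a in children[task])

# Toy two-level hierarchy: Root chooses between primitives a0 and a1.
children = {"Root": ["a0", "a1"]}
V_prim = {"a0": {0: 1.0}, "a1": {0: 2.0}}
C = {("Root", 0, "a0"): 0.5, ("Root", 0, "a1"): -1.0}
print(V("Root", 0, children, V_prim, C))    # max(1.0+0.5, 2.0-1.0) = 1.5
```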
We present Q-functionals, an alternative architecture for continuous-control deep reinforcement learning. Instead of returning a single value for a state-action pair, our network transforms a state into a function that can be rapidly evaluated in parallel for many actions, allowing us to efficiently choose high-value actions through sampling. This contrasts with typical off-policy control, where the policy i...
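A hedged sketch of the idea follows; the quadratic action basis and the random weight matrix standing in for a trained network are assumptions, not the paper's exact architecture:

```python
import numpy as np

# Hedged sketch of a Q-functional: the network maps a state to the
# coefficients of a function over actions (here a quadratic basis), so the
# values of many candidate actions can be evaluated in one batched product.
rng = np.random.default_rng(0)
state_dim, action_dim, n_candidates = 4, 2, 256

def action_features(a: np.ndarray) -> np.ndarray:
    """Quadratic basis over actions: [1, a, a**2] per dimension (assumed)."""
    return np.concatenate([np.ones((len(a), 1)), a, a ** 2], axis=1)

W = rng.normal(size=(state_dim, 1 + 2 * action_dim))  # stand-in for a network

def best_action(state: np.ndarray) -> np.ndarray:
    coeffs = state @ W                                  # state -> functional
    actions = rng.uniform(-1, 1, size=(n_candidates, action_dim))
    values = action_features(actions) @ coeffs          # all values at once
    return actions[np.argmax(values)]

print(best_action(rng.normal(size=state_dim)))
```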
This paper provides an analysis of error propagation in Approximate Dynamic Programming applied to zero-sum two-player Stochastic Games. We provide a novel and unified error propagation analysis in Lp-norm of three well-known algorithms adapted to Stochastic Games (namely Approximate Value Iteration, Approximate Policy Iteration and Approximate Generalized Policy Iteration). We show that we ca...
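For context, the classic single-agent, sup-norm analogue of such bounds (the paper's contribution is the finer Lp-norm analysis for stochastic games) says that if every iterate of approximate value iteration incurs error at most ε, the greedy policies π_k satisfy:

```latex
% Classic sup-norm error-propagation bound for approximate value iteration
% (single-agent MDP analogue; the paper above works in Lp-norm for games).
\|V_{k+1} - T V_k\|_{\infty} \le \varepsilon
\quad\Longrightarrow\quad
\limsup_{k \to \infty} \|V^{*} - V^{\pi_k}\|_{\infty}
\le \frac{2\gamma}{(1-\gamma)^{2}}\,\varepsilon .
```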
Background: Students' sleep patterns differ from those of their non-student peers owing to study stress and teaching workload. The aim of this study was to determine the prevalence of poor sleep quality among college students in Iran through a meta-analysis, to serve as a summary measure for policy makers in this field. Methods: In this meta-analysis study, the databases of PubMed, Science Direct...