q value

Smoothed Action Value Functions

2018

State-action value functions (i.e., Q-values) are ubiquitous in reinforcement learning (RL), giving rise to popular algorithms such as SARSA and Q-learning. We propose a new notion of action value defined by a Gaussian smoothed version of the expected Q-value. We show that such smoothed Q-values still satisfy a Bellman equation, making them learnable from experience sampled from an environment....

Squareness in the special L-value

2007

Amod Agashe

Let N be a prime and let A be a quotient of J0(N) over Q associated to a newform f such that the special L-value of A (at s = 1) is non-zero. Suppose that the algebraic part of special L-value is divisible by an odd prime q such that q does not divide the numerator of (N−1)/12. Then the Birch and Swinnerton-Dyer conjecture predicts that q divides the algebraic part of special L-value of A, as we...

Scalar form-factors $f_{\pi\pi}(Q^2)$ and $f_{KK}(Q^2)$ with light-cone QCD sum rules

2007

Zhi-Gang Wang

In this article, we calculate the scalar form-factors $f_{\pi\pi}(Q^2)$ and $f_{KK}(Q^2)$ in the framework of the light-cone QCD sum rules approach. The numerical value of the $f_{\pi\pi}(Q^2)$ changes quickly with variation of $Q^2$ near zero momentum transfer, while the $f_{KK}(Q^2)$ has rather good behavior at small momentum transfer. The value $f_{KK}(0) = 2.21^{+0.35}_{-0.19}$ GeV is compatible with the result from the leading ord...

VALUE DISTRIBUTION AND UNIQUENESS OF q-SHIFT DIFFERENCE POLYNOMIALS

2016

Pulak Sahoo

In this paper, we deal with the distribution of zeros of q-shift difference polynomials of transcendental entire functions of zero order. At the same time we also investigate the uniqueness problems when two difference products of entire functions share one value with finite weight. The results of the paper improve and generalize some recent results due to Xu, Liu and Cao [Math. Commun. 20 (201...

Q-value Heuristics for Approximate Solutions of Dec-POMDPs

2007

Frans A. Oliehoek, Nikos A. Vlassis

The Dec-POMDP is a model for multi-agent planning under uncertainty that has received increasingly more attention over the recent years. In this work we propose a new heuristic $Q_{BG}$ that can be used in various algorithms for Dec-POMDPs and describe differences and similarities with $Q_{MDP}$ and $Q_{POMDP}$. An experimental evaluation shows that, at the price of some computation, $Q_{BG}$ gives a consistently ...

Determining optimal medical image compression: psychometric and image distortion analysis

2012

Alexander C. Flint

BACKGROUND: Storage issues and bandwidth over networks have led to a need to optimally compress medical imaging files while leaving clinical image quality uncompromised. METHODS: To determine the range of clinically acceptable medical image compression across multiple modalities (CT, MR, and XR), we performed psychometric analysis of image distortion thresholds using physician readers and also ...

The Advantage Learning Operator

2016

Greg Farquhar

Value-based reinforcement learning typically involves the repeated application of an update rule, such as the Bellman operator $T_B$, to an action-value function. Recent work has explored the use of alternative operators, which remain optimality-preserving and may result in improved performance. In this report, I study in particular the advantage learning operator, $T_{AL} Q = T_B Q - \alpha (V - Q)$. A theoret...

Damping of oscillations in the semiquinone absorption in reaction centers after successive flashes. Determination of the equilibrium between $Q_A^- Q_B$ and $Q_A Q_B^-$.

Journal: Biochimica et Biophysica Acta 1984

D. Kleinfeld, E. C. Abresch, M. Y. Okamura, G. Feher

A quantitative model for the damping of oscillations of the semiquinone absorption after successive light flashes is presented. It is based on the equilibrium between the states $Q_A^- Q_B$ and $Q_A Q_B^-$. A fit of the model to the experimental results obtained for reaction centers from Rhodopseudomonas sphaeroides gave a value of $\alpha = [Q_A^- Q_B]/([Q_A^- Q_B] + [Q_A Q_B^-]) = 0.065 \pm 0.005$ (T = 2...

Positive solution for boundary value problem of fractional differential equation

Journal: Theory of Approximation and Applications

Sh. Rezaei, Department of Mathematics, Islamic Azad University, Aligudarz Branch, Aligudarz, Lorestan, Iran.

In this paper, we prove the existence of the solution for a boundary value problem (BVP) of fractional differential equations of order q ∈ (2, 3]. Krasnoselskii's fixed point theorem is applied to establish the results. In addition, we give a detailed example to demonstrate the main result.

Subgradients of value functions in parametric dynamic programming

2016

B. T. Kien, Y. C. Liou, J.-C. Yao

We study in this paper the first-order behavior of value functions in parametric dynamic programming with linear constraints and nonconvex cost functions. By establishing an abstract result on the Fréchet subdifferential of value functions of parametric mathematical programming problems, some new formulas on the Fréchet subdifferential of value functions in parametric dynamic programming are ob...
