Search results for: q value
Number of results: 842664
This paper deals with the boundary value problem involving the differential equation \begin{equation*} \ell y := -y'' + qy = \lambda y, \end{equation*} subject to the standard boundary conditions along with the following discontinuity conditions at a point $a \in (0,\pi)$: \begin{equation*} y(a+0) = a_1 y(a-0), \quad y'(a+0) = a_1^{-1} y'(a-0) + a_2 y(a-0), \end{equation*} where $q(x)$, $a_1$, $a_2$ are real, $q \in l$...
[... and Applied Analysis, p. 3] ... and for $a = 0$, we denote $I_q f(x) = \int_0^x f(t)\, d_q t = \sum_{n=0}^{\infty} x(1-q)\, q^n f(x q^n)$, (2.4) provided the series converges. If $a \in [0, b]$ and $f$ is defined in the interval $[0, b]$, then $\int_a^b f(t)\, d_q t = \int_0^b f(t)\, d_q t - \int_0^a f(t)\, d_q t$. (2.5) Similarly, we have $I_q^0 f(t) = f(t)$, $I_q^n f(t) = I_q I_q^{n-1} f(t)$, $n \in \mathbb{N}$. (2.6)
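The q-integral in this excerpt (equations 2.4 and 2.5) can be approximated numerically by truncating the series. The following is only a minimal sketch under stated assumptions: the function names `q_integral` and `q_integral_ab` and the truncation parameter `terms` are hypothetical and do not come from the cited paper, and $0 < q < 1$ is assumed so that the series can converge.

```python
def q_integral(f, x, q, terms=200):
    """Truncated q-integral of f over [0, x], following (2.4):
    x * (1 - q) * sum_{n >= 0} q**n * f(x * q**n).
    Assumes 0 < q < 1; `terms` is an illustrative truncation length."""
    if not 0.0 < q < 1.0:
        raise ValueError("this sketch assumes 0 < q < 1")
    total = 0.0
    for n in range(terms):
        total += q**n * f(x * q**n)
    return x * (1.0 - q) * total


def q_integral_ab(f, a, b, q, terms=200):
    """q-integral of f over [a, b] as a difference of two integrals from 0, following (2.5)."""
    return q_integral(f, b, q, terms) - q_integral(f, a, q, terms)
```

For $f(t) = t$ the series is geometric and sums to $x^2/(1+q)$, which gives a convenient sanity check for the truncated sum.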
Consider a sequence of n independent observations from a population of increasing size αi, i = 1,2,... and an absolutely continuous initial distribution function. The distribution of the kth record value is represented as a countable mixture, with mixing the distribution of the kth record time and mixed the distribution of the nth order statistic. Precisely, the distribution function and (pow...
How do social networks motivate people to connect not only to their previously existing friends but also to novel or blind new contacts? We report the results of an experiment to identify the value that participants give to alternative network characteristics when deciding to connect to a social network. We focus on network tie characteristics because they represent information that potentially...
Hierarchical state decompositions address the curse of dimensionality in Q-learning methods for reinforcement learning (RL) but can suffer from suboptimality. In addressing this, we introduce the Economic Hierarchical Q-Learning (EHQ) algorithm for hierarchical RL. The EHQ algorithm uses subsidies to align interests such that agents that would otherwise converge to a recursively optimal policy w...
In order for researchers to understand and predict behavior, they must consider both person and situation factors and how these factors interact. Even though organization researchers have developed interactional models, many have overemphasized either person or situation components, and most have failed to consider the effects that persons have on situations. This paper presents criteria for im...
Q-learning is a popular reinforcement learning algorithm, but it can perform poorly in stochastic environments due to overestimating action values. Overestimation is due to the use of a single estimator that uses the maximum action value as an approximation for the maximum expected action value. To avoid overestimation in Q-learning, the double Q-learning algorithm was recently proposed, which u...
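The excerpt is cut off before it describes the method, so the following is only a sketch of the tabular double Q-learning update as it is commonly presented, not necessarily the formulation used in this paper. Two tables are kept; one selects the greedy next action and the other evaluates it, which decouples selection from evaluation. The function name `double_q_update`, the list-of-lists table layout, and the `update_a` switch are assumptions made for illustration.

```python
import random


def double_q_update(qa, qb, s, a, r, s_next, alpha=0.1, gamma=0.99, update_a=None):
    """One tabular double Q-learning step on tables qa and qb (lists indexed [state][action]).
    If update_a is None, a fair coin decides which table is updated (hypothetical switch,
    exposed so the step can be made deterministic)."""
    if update_a is None:
        update_a = random.random() < 0.5
    update, evaluate = (qa, qb) if update_a else (qb, qa)
    # The table being updated selects the greedy action in the next state ...
    best = max(range(len(update[s_next])), key=lambda j: update[s_next][j])
    # ... and the other table supplies the value estimate for that action.
    target = r + gamma * evaluate[s_next][best]
    update[s][a] += alpha * (target - update[s][a])
```

In standard Q-learning the target would instead be `r + gamma * max(q[s_next])`, where the same estimates both pick and score the action; that is the single-estimator maximum the excerpt identifies as the source of overestimation.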
Yishay Mansour. We show the convergence of two deterministic variants of Q-learning. The first is the widely used optimistic Q-learning, which initializes the Q-values to large initial values and then follows a greedy policy with respect to the Q-values. We show that setting the initial value sufficiently large guarantees convergence to an ε-optimal policy. The second is a new and novel algo...
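The optimistic variant described in this excerpt (large initial Q-values followed by a purely greedy policy) can be illustrated on a toy deterministic problem. This is a hedged sketch only: the function name `optimistic_q_learning`, the table layout, the fixed learning rate, and the example problem are assumptions for illustration, and the code does not reproduce the paper's analysis or its conditions on the initial value.

```python
def optimistic_q_learning(transitions, rewards, q_init, alpha=0.5, gamma=0.9,
                          steps=500, start=0):
    """Greedy Q-learning with optimistic initialization on a deterministic MDP.
    transitions[s][a] is the next state and rewards[s][a] the reward (assumed layout)."""
    n_states = len(transitions)
    n_actions = len(transitions[0])
    # Every entry starts at the same large value, so untried actions look attractive.
    q = [[q_init] * n_actions for _ in range(n_states)]
    s = start
    for _ in range(steps):
        # Purely greedy choice; ties go to the lowest action index.
        a = max(range(n_actions), key=lambda j: q[s][j])
        s_next = transitions[s][a]
        target = rewards[s][a] + gamma * max(q[s_next])
        q[s][a] += alpha * (target - q[s][a])
        s = s_next
    return q


# Toy usage: one state, two actions with rewards 0.2 and 1.0, no discounting.
q = optimistic_q_learning([[0, 0]], [[0.2, 1.0]], q_init=5.0, gamma=0.0, steps=50)
greedy_action = max(range(2), key=lambda j: q[0][j])
```

Because both entries start above any achievable return, each update lowers the estimate of the action just taken, so the greedy policy is pushed to try the other action without any explicit exploration noise.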