Search results for: stochastic gradient descent learning
Number of results: 840759
Iterative procedures in stochastic optimization are typically comprised of a transient phase and a stationary phase. During the transient phase the procedure converges towards a region of interest, and during the stationary phase the procedure oscillates in a convergence region, commonly around a single point. In this paper, we develop a statistical diagnostic test to detect such phase transiti...
This paper presents an online support vector machine (SVM) that uses the stochastic meta-descent (SMD) algorithm to adapt its step size automatically. We formulate the online learning problem as a stochastic gradient descent in reproducing kernel Hilbert space (RKHS) and translate SMD to the nonparametric setting, where its gradient trace parameter is no longer a coefficient vector but an eleme...
In this paper we propose a novel approach to automatically determine the batch size in stochastic gradient descent methods. The choice of the batch size induces a trade-off between the accuracy of the gradient estimate and the cost in terms of samples of each update. We propose to determine the batch size by optimizing the ratio between a lower bound to a linear or quadratic Taylor approximatio...
Alopex is a correlation-based gradient-free optimization technique useful in many learning problems. However, there are no analytical results on the asymptotic behavior of this algorithm. This article presents a new version of Alopex that can be analyzed using techniques from the two-timescale stochastic approximation method. It is shown that the algorithm asymptotically behaves like a gradient-desce...
The stochastic composition optimization proposed recently by Wang et al. [2014] minimizes the objective with the compositional expectation form: $\min_x (\mathbb{E}_i F_i \circ \mathbb{E}_j G_j)(x)$. It summarizes many important applications in machine learning, statistics, and finance. In this paper, we consider the finite-sum scenario for composition optimization: $\min_x f(x) := \frac{1}{n} \sum_{i=1}^{n} F_i\left( \frac{1}{m} \sum_{j=1}^{m} G_j(x) \right)$. We pro...
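As a rough illustration of the finite-sum composition objective above, f(x) = (1/n) Σ_i F_i((1/m) Σ_j G_j(x)), the sketch below evaluates the objective and its exact gradient by the chain rule for a scalar x. The component functions G_j(x) = a[j]*x + b[j] and F_i(y) = (y - c[i])^2, and all function names, are assumptions chosen only to keep the example small. It shows the structure of the problem, not the method proposed in the abstract.

```python
def inner_mean(x, a, b):
    """Inner average (1/m) * sum_j G_j(x), with hypothetical G_j(x) = a[j]*x + b[j]."""
    return sum(aj * x + bj for aj, bj in zip(a, b)) / len(a)


def objective(x, a, b, c):
    """Outer average (1/n) * sum_i F_i(y), with hypothetical F_i(y) = (y - c[i])**2."""
    y = inner_mean(x, a, b)
    return sum((y - ci) ** 2 for ci in c) / len(c)


def gradient(x, a, b, c):
    """Exact gradient via the chain rule: f'(x) = mean_i F_i'(y) * mean_j G_j'(x)."""
    y = inner_mean(x, a, b)
    dy_dx = sum(a) / len(a)                           # derivative of the inner average
    df_dy = sum(2.0 * (y - ci) for ci in c) / len(c)  # derivative of the outer average
    return df_dy * dy_dx


a, b, c = [1.0, 2.0, 3.0], [0.0, 1.0, -1.0], [1.0, 2.0]
print(objective(0.5, a, b, c), gradient(0.5, a, b, c))
```

The outer functions are evaluated at the inner average, so an unbiased sample of the inner sum does not by itself give an unbiased sample of the gradient. This is the feature that separates composition optimization from ordinary finite-sum SGD.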
This work studies constrained stochastic optimization problems where the objective and constraint functions are convex and expressed as compositions of functions. The problem arises in the context of fair classification, regression, and the design of queuing systems. Of particular interest is the large-scale setting in which an oracle provides gradients of the constituent functions, and the goal is to solve the problem with a minimal number of calls to the oracle. Owi...
We consider a decentralized learning setting in which data is distributed over the nodes of a graph. The goal is to learn a global model on the distributed data without involving any central entity that needs to be trusted. While gossip-based stochastic gradient descent (SGD) can be used to achieve this objective, it incurs high communication and computation costs, since it has to wait for all local models at the nodes to converge. To speed up converge...
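The gossip-based baseline mentioned above can be sketched as alternating neighbor averaging with local gradient steps, with no central coordinator. The sketch below makes several simplifying assumptions that are not from the abstract: a ring topology, a scalar model, one local quadratic loss 0.5 * (w - targets[i])^2 per node, and exact local gradients instead of sampled ones so that the run is deterministic. The function name `gossip_gd` is hypothetical.

```python
def gossip_gd(targets, step=0.1, rounds=200):
    """Decentralized gradient descent on a ring (illustrative assumptions only).

    Node i holds a scalar model w[i] and the local loss 0.5 * (w - targets[i])**2.
    Each round, every node averages with its two ring neighbors (gossip step),
    then takes a gradient step on its own local loss.
    """
    n = len(targets)
    w = [0.0] * n
    for _ in range(rounds):
        # Gossip step: uniform averaging with the left and right neighbors.
        mixed = [(w[(i - 1) % n] + w[i] + w[(i + 1) % n]) / 3.0 for i in range(n)]
        # Local step: gradient of 0.5 * (w - t)**2 is (w - t).
        w = [mixed[i] - step * (mixed[i] - targets[i]) for i in range(n)]
    return w


models = gossip_gd([1.0, 2.0, 3.0, 6.0])
print(sum(models) / len(models))  # network average of the local models
```

In this toy problem the network average of the models converges to the minimizer of the summed loss (the mean of the targets), while the individual node models generally keep a small disagreement under a constant step size.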
1) An existing study applies the mini-batch approach to SVRG, one of the variance-reduction (VR) methods. It shows that the approach does not scale well: there is no significant difference between using 16 threads and using more [2]. This study investigates the cause of the poor scalability of this existing mini-batch approach on VR methods. 2) The performance of the mini-batch approach in the distributed setting is improved by ...
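For readers unfamiliar with the mini-batch approach on SVRG referred to above, here is a minimal single-threaded sketch. It assumes a one-dimensional least-squares problem with per-sample loss 0.5 * (w * x_i - y_i)^2. The function names, step size, and loop counts are illustrative assumptions, not settings from the study, and the sketch says nothing about multi-threaded scalability.

```python
import random


def grad_i(w, x, y):
    """Gradient of the per-sample loss 0.5 * (w * x - y)**2 with respect to w."""
    return x * (w * x - y)


def minibatch_svrg(xs, ys, step=0.02, epochs=30, inner_steps=10, batch_size=2, seed=0):
    """Mini-batch SVRG on a scalar least-squares problem (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0
    for _ in range(epochs):
        # Snapshot the current iterate and compute the full gradient there.
        w_snap = w
        full_grad = sum(grad_i(w_snap, xs[i], ys[i]) for i in range(n)) / n
        for _ in range(inner_steps):
            batch = rng.sample(range(n), batch_size)
            # Variance-reduced estimate: batch gradient at w, minus the batch
            # gradient at the snapshot, plus the full snapshot gradient.
            correction = sum(
                grad_i(w, xs[i], ys[i]) - grad_i(w_snap, xs[i], ys[i]) for i in batch
            ) / batch_size
            w -= step * (correction + full_grad)
    return w


print(minibatch_svrg([1.0, 2.0, 3.0, 4.0], [3.0, 6.0, 9.0, 12.0]))  # data generated with w = 3
```

The full-gradient pass at each snapshot is a natural synchronization point in parallel implementations, but whether it explains the scalability limit reported above is not something this sketch can establish.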