Search results for: gradient descent algorithm

Number of results: 869527

Journal: Journal of Computational and Graphical Statistics, 2007

Journal: Neural Computation, 2002
P. S. Sastry, M. Magesh, K. P. Unnikrishnan

Alopex is a correlation-based, gradient-free optimization technique useful in many learning problems. However, there are no analytical results on the asymptotic behavior of this algorithm. This article presents a new version of Alopex that can be analyzed using two-timescale stochastic approximation techniques. It is shown that the algorithm asymptotically behaves like a gradient-desce...
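
As a rough illustration of the correlation-based, gradient-free update that Alopex-style methods use, here is a minimal sketch; it is not the authors' analyzed variant, and the quadratic objective, step size `delta`, constant temperature `T`, and sign convention are assumptions for demonstration.

```python
import numpy as np

def alopex_like_minimize(f, w, delta=0.05, T=0.1, steps=2000, seed=0):
    """Correlation-based, gradient-free minimization sketch in the spirit of Alopex.

    Each coordinate moves by +/- delta; the sign is biased away from directions
    whose recent moves were positively correlated with an increase in f.
    """
    rng = np.random.default_rng(seed)
    prev_w = w.copy()
    prev_f = f(w)
    for _ in range(steps):
        dw = w - prev_w                  # previous parameter change
        dE = f(w) - prev_f               # previous objective change
        corr = dw * dE                   # per-coordinate correlation
        # Probability of a +delta move: low if recent +moves raised the objective.
        p_plus = 1.0 / (1.0 + np.exp(corr / T))
        step = np.where(rng.random(w.shape) < p_plus, delta, -delta)
        prev_w, prev_f = w.copy(), f(w)
        w = w + step
    return w

# Toy usage: minimize a simple quadratic (assumed example, not from the paper).
f = lambda w: np.sum((w - 1.0) ** 2)
print(alopex_like_minimize(f, np.zeros(3)))
```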

2005
Nikhil Bobb, David Helmbold, Philip Zigoris

Although boosting methods have become an extremely important classification method, there has been little attention paid to boosting with asymmetric losses. In this paper we take a gradient descent view of boosting in order to motivate a new boosting variant called BiBoost which treats the two classes differently. This variant is likely to perform well when there is a different cost for false p...
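
To make the gradient-descent view of asymmetric boosting concrete, here is a hedged sketch of functional gradient boosting with class-dependent misclassification costs on an exponential loss. This is a generic illustration, not the BiBoost procedure itself; the decision-stump learner and the cost values are assumptions.

```python
import numpy as np

def asymmetric_boost(X, y, cost_pos=5.0, cost_neg=1.0, rounds=20):
    """Gradient-descent-style boosting sketch with class-dependent costs.

    y in {-1, +1}. Each round fits a decision stump to the reweighted examples,
    where the weights are the (negative) functional gradient of a cost-weighted
    exponential loss at the current ensemble scores F.
    """
    F = np.zeros(len(y))                 # current ensemble scores
    cost = np.where(y > 0, cost_pos, cost_neg)
    stumps = []
    for _ in range(rounds):
        w = cost * np.exp(-y * F)        # per-example weight from the loss gradient
        w /= w.sum()
        # Exhaustively pick the best threshold stump under weights w.
        best = None
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, j] > thr, 1, -1)
                    err = np.sum(w[pred != y])
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # step size along the chosen stump
        F += alpha * sign * np.where(X[:, j] > thr, 1, -1)
        stumps.append((alpha, j, thr, sign))
    return stumps

# Tiny synthetic usage (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
model = asymmetric_boost(X, y)
```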

Journal: CoRR, 2017
Yann Ollivier

We introduce a simple algorithm, True Asymptotic Natural Gradient Optimization (TANGO), that converges to a true natural gradient descent in the limit of small learning rates, without explicit Fisher matrix estimation. For quadratic models the algorithm is also an instance of averaged stochastic gradient, where the parameter is a moving average of a “fast”, constant-rate gradient descent. TANGO...
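
The "moving average of a fast, constant-rate gradient descent" idea can be illustrated with a short averaged-SGD sketch. This is not the TANGO algorithm itself; the quadratic objective, constant learning rate, noise level, and averaging weight are all assumptions.

```python
import numpy as np

def averaged_constant_rate_sgd(grad, x0, lr=0.1, avg_rate=0.01, steps=1000, seed=0):
    """The fast iterate x runs constant-rate SGD; the returned parameter is a
    slow moving average of the fast trajectory (averaged-SGD sketch)."""
    rng = np.random.default_rng(seed)
    x = x0.copy()          # fast, constant-rate iterate
    x_bar = x0.copy()      # slow moving average, the actual estimate
    for _ in range(steps):
        g = grad(x) + 0.1 * rng.standard_normal(x.shape)  # noisy gradient
        x = x - lr * g
        x_bar = (1 - avg_rate) * x_bar + avg_rate * x
    return x_bar

# Toy quadratic with minimum at [1, 2] (assumed example).
grad = lambda x: x - np.array([1.0, 2.0])
print(averaged_constant_rate_sgd(grad, np.zeros(2)))
```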

Journal: CoRR, 2017
Penghang Yin, Minh Pham, Adam M. Oberman, Stanley Osher

In this paper, we propose an implicit gradient descent algorithm for the classic k-means problem. The implicit gradient step, or backward Euler step, is solved via stochastic fixed-point iteration, in which we randomly sample a mini-batch gradient in every iteration. It is the average of the fixed-point trajectory that is carried over to the next gradient step. We draw connections between the proposed...
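
Below is a hedged sketch of a backward-Euler (implicit) gradient step solved by a mini-batch fixed-point iteration, with the average of the fixed-point trajectory carried forward. It is shown for a generic smooth finite-sum objective rather than k-means, and the step size, inner iteration count, and batch size are assumptions.

```python
import numpy as np

def implicit_sgd_step(x, grad_i, n, rng, eta=0.5, inner_iters=20, batch=8):
    """One backward-Euler (implicit) gradient step,
        x_next = x - eta * grad f(x_next),
    approximated by a stochastic fixed-point iteration z <- x - eta * grad_B(z),
    where grad_B is a mini-batch gradient; the trajectory average is returned."""
    z = x.copy()
    z_sum = np.zeros_like(x)
    for _ in range(inner_iters):
        idx = rng.choice(n, size=batch, replace=False)
        g = np.mean([grad_i(i, z) for i in idx], axis=0)
        z = x - eta * g
        z_sum += z
    return z_sum / inner_iters          # average of the fixed-point trajectory

# Toy usage: f(x) = mean_i 0.5*||x - a_i||^2, so grad_i(i, x) = x - a_i (assumed).
rng = np.random.default_rng(1)
a = rng.normal(size=(100, 3))
grad_i = lambda i, x: x - a[i]
x = np.zeros(3)
for _ in range(50):
    x = implicit_sgd_step(x, grad_i, n=len(a), rng=rng)
print(x)          # approaches the mean of the a_i
```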

2016
Onkar Bhardwaj, Guojing Cong

Stochastic Gradient Descent (SGD) and its variants are the most important optimization algorithms used in large-scale machine learning. The mini-batch version of stochastic gradient descent is often used in practice to take advantage of hardware parallelism. In this work, we analyze the effect of mini-batch size on SGD convergence for general non-convex objective functions. Building on the...
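
For reference, here is a minimal mini-batch SGD loop of the kind being analyzed, with the batch size exposed as the parameter of interest; the toy objective, data, and hyperparameters are assumptions for illustration.

```python
import numpy as np

def minibatch_sgd(grad_i, x0, n, batch_size=32, lr=0.05, epochs=10, seed=0):
    """Plain mini-batch SGD: each update averages `batch_size` per-example
    gradients; varying batch_size trades gradient noise against work per step."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            g = np.mean([grad_i(i, x) for i in idx], axis=0)
            x -= lr * g
    return x

# Toy non-convex example: per-example loss 0.5*(x - a_i)^2 + 0.1*sin(5x) (assumed).
a = np.random.default_rng(2).normal(size=200)
grad_i = lambda i, x: (x - a[i]) + 0.5 * np.cos(5 * x)
print(minibatch_sgd(grad_i, np.array(0.0), n=len(a)))
```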

2016
Jeff Daily, Abhinav Vishnu, Charles Siegel

In this paper, we present multiple approaches for improving the performance of gradient descent when utilizing multiple compute resources. The proposed approaches span a solution space ranging from equivalence with running on a single compute device to delaying gradient updates a fixed number of times. We present a new approach, asynchronous layer-wise gradient descent, that maximizes overlap of la...
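
Here is a hedged sketch of the "delay gradient updates a fixed number of times" end of that design space: gradients computed from stale parameters are applied `delay` steps later. This is a single-process simulation of staleness, not the authors' asynchronous layer-wise implementation, and all settings are assumptions.

```python
import numpy as np
from collections import deque

def delayed_gradient_descent(grad, x0, lr=0.05, delay=4, steps=500):
    """Gradient descent in which each applied update is `delay` steps stale,
    mimicking asynchronous workers that push their gradients late."""
    x = x0.copy()
    pending = deque()                        # queue of not-yet-applied gradients
    for _ in range(steps):
        pending.append(grad(x))              # gradient at the current parameters
        if len(pending) > delay:
            x = x - lr * pending.popleft()   # apply the stale gradient
    while pending:                           # drain the remaining gradients
        x = x - lr * pending.popleft()
    return x

# Toy quadratic with minimum at [3, -1] (assumed example).
grad = lambda x: x - np.array([3.0, -1.0])
print(delayed_gradient_descent(grad, np.zeros(2)))
```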

2011
Muqeet Ali, Christopher C. Johnson, Alex K. Tang

We present a distributed stochastic gradient descent algorithm for performing low-rank matrix factorization on streaming data. Low-rank matrix factorization is often used as a technique for collaborative filtering. As opposed to recent algorithms that perform matrix factorization in parallel on a batch of training examples [4], our algorithm operates on a stream of incoming examples. We experim...
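
The core per-example update in SGD matrix factorization is small enough to sketch. Below is a single-worker version for a stream of (user, item, rating) triples; it is a generic illustration rather than the paper's distributed algorithm, with the rank, learning rate, and regularization chosen arbitrarily.

```python
import numpy as np

def streaming_mf_sgd(stream, n_users, n_items, rank=8, lr=0.02, reg=0.05, seed=0):
    """SGD low-rank matrix factorization over a stream of (u, i, r) examples:
    after each observed rating, only the corresponding user/item factors move."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, rank))
    V = 0.1 * rng.standard_normal((n_items, rank))
    for u, i, r in stream:
        err = r - U[u] @ V[i]                       # prediction error
        u_old = U[u].copy()                         # keep old factors for a joint step
        U[u] += lr * (err * V[i] - reg * U[u])      # gradient step on user factors
        V[i] += lr * (err * u_old - reg * V[i])     # gradient step on item factors
    return U, V

# Toy stream of synthetic ratings (assumed data).
rng = np.random.default_rng(3)
stream = [(rng.integers(50), rng.integers(40), rng.uniform(1, 5)) for _ in range(5000)]
U, V = streaming_mf_sgd(stream, n_users=50, n_items=40)
```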

Journal: CoRR, 2016
Sashank J. Reddi, Suvrit Sra, Barnabás Póczos, Alexander J. Smola

We analyze a fast incremental aggregated gradient method for optimizing nonconvex problems of the form $\min_x \sum_i f_i(x)$. Specifically, we analyze the SAGA algorithm within an Incremental First-order Oracle framework, and show that it converges to a stationary point provably faster than both gradient descent and stochastic gradient descent. We also discuss Polyak's special class of nonconvex pro...
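
A minimal sketch of the SAGA-style incremental aggregated gradient update analyzed here, for $\min_x \frac{1}{n}\sum_i f_i(x)$; the least-squares $f_i$ and the step size are assumptions, and no attempt is made to reproduce the paper's nonconvex analysis.

```python
import numpy as np

def saga(grad_i, x0, n, lr=0.02, steps=5000, seed=0):
    """SAGA: keep a table of the last gradient seen for each f_i and use
        g = grad_i(j, x) - table[j] + mean(table)
    as a low-variance estimate of the full gradient."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    table = np.array([grad_i(i, x) for i in range(n)])   # stored gradients
    table_mean = table.mean(axis=0)
    for _ in range(steps):
        j = rng.integers(n)
        g_new = grad_i(j, x)
        x = x - lr * (g_new - table[j] + table_mean)
        table_mean += (g_new - table[j]) / n             # keep the mean current
        table[j] = g_new
    return x

# Toy usage: f_i(x) = 0.5*(a_i . x - b_i)^2 (assumed example).
rng = np.random.default_rng(4)
A, b = rng.normal(size=(100, 3)), rng.normal(size=100)
grad_i = lambda i, x: (A[i] @ x - b[i]) * A[i]
print(saga(grad_i, np.zeros(3), n=100))
```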

2018
Yoonho Lee, Seungjin Choi

Gradient-based meta-learning has been shown to be expressive enough to approximate any learning algorithm. While previous such methods have been successful in meta-learning tasks, they resort to simple gradient descent during meta-testing. Our primary contribution is the MT-net, which enables the meta-learner to learn on each layer’s activation space a subspace that the task-specific learner pe...
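
To ground "simple gradient descent during meta-testing", here is a hedged MAML-style sketch in which a meta-learned initialization is adapted to a new task by a few plain gradient steps. It illustrates the gradient-based meta-learning baseline the MT-net builds on, not the MT-net itself; the first-order (Reptile-like) outer update and the toy regression tasks are assumptions.

```python
import numpy as np

def adapt(theta, grad_task, inner_lr=0.05, inner_steps=5):
    """Meta-testing in gradient-based meta-learning: start from the meta-learned
    initialization theta and run a few plain gradient steps on the new task."""
    w = theta.copy()
    for _ in range(inner_steps):
        w = w - inner_lr * grad_task(w)
    return w

# Toy meta-training over 1-D linear-regression tasks (assumed, first-order /
# Reptile-like outer loop as a stand-in for the full meta-learning procedure).
rng = np.random.default_rng(5)
theta = np.zeros(2)                            # meta-learned init: [slope, bias]
for _ in range(200):
    slope = rng.uniform(-2, 2)                 # sample a task
    x = rng.uniform(-1, 1, size=20)
    y = slope * x + 0.5
    grad_task = lambda w: np.array([
        np.mean((w[0] * x + w[1] - y) * x),    # d loss / d slope
        np.mean(w[0] * x + w[1] - y),          # d loss / d bias
    ])
    theta += 0.1 * (adapt(theta, grad_task) - theta)   # move toward adapted weights
print(theta)
```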

[Chart: number of search results per year]
