Search results for: gradient descent algorithm

Number of results: 869527

2004
Jadranka Skorin-Kapov, Wendy Tang

In this paper we explore different strategies to guide the backpropagation algorithm used for training artificial neural networks. Two variants of the steepest-descent-based backpropagation algorithm and four variants of the conjugate gradient algorithm are tested. The variants differ in whether or not the time component is used, and whether or not additional gradient information is utili...
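
The abstract contrasts steepest-descent and conjugate-gradient weight updates. As a rough illustration only (not the paper's specific variants, and assuming a user-supplied gradient function grad), the two update rules can be sketched as:

import numpy as np

def steepest_descent_step(w, grad, lr=0.01):
    # Plain steepest-descent update: move against the current gradient.
    return w - lr * grad(w)

def conjugate_gradient_step(w, grad, d_prev, g_prev, lr=0.01):
    # Fletcher-Reeves style update: mix the new gradient with the
    # previous search direction instead of following the gradient alone.
    g = grad(w)
    beta = (g @ g) / (g_prev @ g_prev)
    d = -g + beta * d_prev
    return w + lr * d, d, g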

2017
Fei Ye

In this paper, we propose a new automatic hyperparameter selection approach for determining the optimal network configuration (network structure and hyperparameters) for deep neural networks using particle swarm optimization (PSO) in combination with a steepest gradient descent algorithm. In the proposed approach, network configurations were coded as a set of real-number m-dimensional vectors a...
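
As a hedged sketch of the swarm-search side of such an approach (hyperparameter vectors as particles; the objective, bounds, and coefficients below are placeholders, not the paper's settings):

import numpy as np

def pso(objective, dim, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    # Minimal particle swarm optimization over real-valued vectors.
    # `objective` is assumed to train/evaluate a network for a given
    # hyperparameter vector and return a validation loss to minimize.
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1.0, 1.0, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([objective(p) for p in pos])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()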

Journal: Medical image analysis, 2007
Zhong Tao, Hemant D. Tagare

Active contours that evolve in ultrasound images under gradient descent are often trapped in spurious local minima. This paper presents an evolution strategy called tunneling descent, which is capable of escaping from such minima. The key idea is to evolve the contour by a sequence of constrained minimizations that move the contour into, and out of, local minima. This strategy is an extension ...

1996
Ramesh R. Sarukkai, Dana H. Ballard

Based on the observation that the unpredictable nature of conversational speech makes it almost impossible to reliably model sequential word constraints, the notion of word set error criteria is proposed for improved recognition of spontaneous dialogues. The basic idea in the TAB algorithm is to predict a set of words based on some a priori information, and perform a re-scoring pass wherein the...

2013
Simon Wiesler, Jinyu Li, Jian Xue

Context-dependent deep neural network HMMs have been shown to achieve recognition accuracy superior to Gaussian mixture models in a number of recent works. Typically, neural networks are optimized with stochastic gradient descent. On large datasets, stochastic gradient descent improves quickly at the beginning of the optimization, but since it does not make use of second-order information, ...
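
For reference, the plain mini-batch stochastic gradient descent loop the abstract refers to can be sketched as follows (grad_batch and data are hypothetical stand-ins; only first-order information is used):

import numpy as np

def sgd(grad_batch, w, data, lr=0.1, epochs=5, batch_size=32, seed=0):
    # Mini-batch SGD: each update uses only the gradient of the loss
    # on the current mini-batch (no curvature / second-order terms).
    rng = np.random.default_rng(seed)
    n = len(data)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = data[idx[start:start + batch_size]]
            w = w - lr * grad_batch(w, batch)
    return w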

Journal: IEEE transactions on neural networks, 1999
David P. Helmbold, Jyrki Kivinen, Manfred K. Warmuth

We analyze and compare the well-known gradient descent algorithm and the more recent exponentiated gradient algorithm for training a single neuron with an arbitrary transfer function. Both algorithms are easily generalized to larger neural networks, and the generalization of gradient descent is the standard backpropagation algorithm. In this paper we prove worst-case loss bounds for both algori...
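
The two update rules being compared can be written side by side as a minimal sketch (the learning rate and loss gradient are placeholders; the exponentiated-gradient update assumes positive weights whose total mass is kept fixed):

import numpy as np

def gd_update(w, grad, lr=0.1):
    # Gradient descent: additive update against the gradient.
    return w - lr * grad

def eg_update(w, grad, lr=0.1):
    # Exponentiated gradient: multiplicative update, then renormalize
    # so the positive weights keep the same total mass.
    v = w * np.exp(-lr * grad)
    return v * (w.sum() / v.sum())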

2011
Matthieu Kowalski

This paper proposes an enhancement of the nonlinear conjugate gradient algorithm for some non-smooth problems. We first extend some results on descent algorithms from the smooth case to convex non-smooth functions. We then construct a conjugate descent algorithm based on the proximity operator to obtain a descent direction. We finally provide a convergence analysis of this algorithm, even when ...
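
The proximity operator mentioned here is the usual building block of non-smooth descent steps. A minimal sketch of a single forward-backward (proximal gradient) step with the l1 prox, which is only the basic ingredient and not the paper's conjugate variant:

import numpy as np

def soft_threshold(x, t):
    # Proximity operator of t * ||.||_1 (soft-thresholding).
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def proximal_gradient_step(x, grad_smooth, lam, step):
    # Gradient step on the smooth part, then prox of the non-smooth part,
    # for an objective of the form f(x) + lam * ||x||_1.
    return soft_threshold(x - step * grad_smooth(x), lam * step)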

2018
Filip Hanzely, Peter Richtárik

Relative smoothness, a notion introduced in [6] and recently rediscovered in [3, 18], generalizes the standard notion of smoothness typically used in the analysis of gradient-type methods. In this work we take ideas from the well-studied field of stochastic convex optimization and use them to obtain faster algorithms for minimizing relatively smooth functions. We propose and analyze ...
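
For context, relative smoothness of f with respect to a reference function h is usually stated as follows (the standard definition, not a quote from the paper):

f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + L \, D_h(y, x),
\qquad D_h(y, x) = h(y) - h(x) - \langle \nabla h(x), y - x \rangle,

and the corresponding Bregman (mirror-descent-type) gradient step replaces the Euclidean proximity term by D_h:

x_{k+1} = \arg\min_{x} \left\{ \langle \nabla f(x_k), x \rangle + L \, D_h(x, x_k) \right\}.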

2017

In order to prove the main theorem, we need to show that the algorithm will not get stuck at any point that either has a large gradient or is a saddle point. This idea is similar to previous works (e.g., Ge et al., 2015). We first state a standard lemma showing that if the current gradient is large, then gradient descent makes progress in function value. Lemma 12. Assume f(·) satisfies A1; then for gradient desc...
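
The standard descent step behind such a lemma, assuming A1 is the usual L-Lipschitz-gradient condition (an assumption here, since the abstract is truncated): for gradient descent x_{t+1} = x_t - \eta \nabla f(x_t) with step size \eta \le 1/L, smoothness gives

f(x_{t+1}) \le f(x_t) - \eta \|\nabla f(x_t)\|^2 + \frac{L \eta^2}{2} \|\nabla f(x_t)\|^2 \le f(x_t) - \frac{\eta}{2} \|\nabla f(x_t)\|^2,

so a large gradient forces a correspondingly large decrease in function value.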

Journal: Comp. Opt. and Appl., 2011
Dongyi Liu, Genqi Xu

In this paper, a new conjugate gradient method is proposed by applying Powell's symmetrical technique to conjugate gradient methods; it satisfies the sufficient descent property for any line search. Using Wolfe line searches, the global convergence of the method is derived from the spectral analysis of the conjugate gradient iteration matrix and Zoutendijk's condition. Based on this, two con...
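
A generic nonlinear conjugate gradient loop with a Wolfe line search (a sketch using the Polak-Ribiere-plus coefficient, not the symmetrized method proposed in the paper) might look like:

import numpy as np
from scipy.optimize import line_search

def nonlinear_cg(f, grad, x0, iters=200, tol=1e-6):
    # Nonlinear CG: Wolfe line search along d, then update the direction.
    x, g = x0, grad(x0)
    d = -g
    for _ in range(iters):
        alpha = line_search(f, grad, x, d, gfk=g)[0]
        if alpha is None:
            alpha, d = 1e-3, -g  # line search failed: fall back to steepest descent
        x_new = x + alpha * d
        g_new = grad(x_new)
        if np.linalg.norm(g_new) < tol:
            return x_new
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))  # PR+ coefficient
        d = -g_new + beta * d
        x, g = x_new, g_new
    return x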
