Search results for: gradient descent algorithm
Number of results: 869527
We analyze and compare the well-known Gradient Descent algorithm and a new algorithm, called the Exponentiated Gradient algorithm, for training a single neuron with an arbitrary transfer function. Both algorithms are easily generalized to larger neural networks, and the generalization of Gradient Descent is the standard back-propagation algorithm. In this paper we prove worst-case loss bounds f...
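As a rough illustration of the two update rules contrasted in this abstract, the sketch below applies one additive Gradient Descent step and one multiplicative Exponentiated Gradient step to a single linear neuron with squared loss; the learning rate eta, the toy data, and the simplex normalization in the EG step are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def gd_update(w, x, y, eta=0.1):
    """One Gradient Descent step for a single linear neuron with squared loss."""
    grad = (w @ x - y) * x          # d/dw of 0.5 * (w.x - y)^2
    return w - eta * grad

def eg_update(w, x, y, eta=0.1):
    """One Exponentiated Gradient step: multiplicative update, then renormalize.

    Assumes the weights are kept positive and summing to one (the usual EG setting).
    """
    grad = (w @ x - y) * x
    w_new = w * np.exp(-eta * grad)
    return w_new / w_new.sum()

# Toy usage: both rules applied to the same example.
w = np.full(3, 1.0 / 3.0)
x, y = np.array([0.2, 0.5, 0.3]), 1.0
print(gd_update(w.copy(), x, y))
print(eg_update(w.copy(), x, y))
```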
We examine relations between popular variational methods in image processing and classical operator splitting methods in convex analysis. We focus on a gradient descent reprojection algorithm for image denoising and the recently proposed Split Bregman and alternating Split Bregman methods. By identifying the latter with the so-called Douglas-Rachford splitting algorithm we can guarantee its conv...
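For reference, the Douglas-Rachford splitting iteration mentioned above is usually stated in the following standard proximal form for minimizing f(x) + g(x); the notation and the step size are the generic textbook ones, not necessarily those of the cited work.

```latex
% Standard Douglas-Rachford iteration for minimizing f(x) + g(x),
% written with proximal operators and step size \gamma > 0.
\begin{aligned}
x^{k+1} &= \operatorname{prox}_{\gamma f}\bigl(y^{k}\bigr), \\
y^{k+1} &= y^{k} + \operatorname{prox}_{\gamma g}\bigl(2x^{k+1} - y^{k}\bigr) - x^{k+1}.
\end{aligned}
```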
Given the importance of conjugate gradient methods for large-scale optimization, this study proposes a descent three-term conjugate gradient method based on an extended modified secant condition. The proposed method uses objective function values in addition to gradient information. It is also established that the method is globally convergent without convexity assu...
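A generic three-term conjugate gradient direction has the schematic form below; the particular coefficients β_k and θ_k in the cited method would be derived from the extended modified secant condition (which also uses function values), so this is a placeholder for the family of directions rather than the paper's exact formula.

```latex
% Generic three-term conjugate gradient search direction; the specific
% coefficients \beta_k and \theta_k of the cited method come from an
% extended modified secant condition that also uses function values.
d_0 = -g_0, \qquad
d_{k+1} = -g_{k+1} + \beta_k d_k + \theta_k y_k, \qquad
y_k = g_{k+1} - g_k .
```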
In theory, the successive gradients generated by the conjugate gradient method applied to a quadratic should be orthogonal. However, for some ill-conditioned problems, orthogonality is quickly lost due to rounding errors, and convergence is much slower than expected. A limited memory version of the nonlinear conjugate gradient method is developed. The memory is used to both detect the loss of o...
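One common way to detect the loss of orthogonality described in this abstract is a Powell-style restart test on successive gradients; the sketch below shows only that test, and the threshold nu and the function name are conventional illustrative choices, not the limited-memory mechanism of the cited method.

```python
import numpy as np

def should_restart(g_new, g_old, nu=0.2):
    """Powell-style restart test for nonlinear conjugate gradient.

    Successive gradients should be nearly orthogonal on a quadratic; when
    |g_new . g_old| grows relative to ||g_new||^2, orthogonality has been
    lost and the method is typically restarted along -g_new.
    The threshold nu = 0.2 is a conventional choice, not from the cited paper.
    """
    return abs(g_new @ g_old) >= nu * (g_new @ g_new)
```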
Feed-forward neural networks are commonly used for pattern classification. The classification accuracy of feed-forward neural networks depends on the configuration selected and the training process. Once the architecture of the network is decided, training algorithms, usually gradient descent techniques, are used to determine the connection weights of the feed-forward neural network. However, g...
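As a minimal illustration of determining the connection weights of a feed-forward network by gradient descent, the sketch below fits a one-hidden-layer network to a toy XOR-style task; the architecture, learning rate, iteration count, and data are arbitrary choices for illustration, not from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: XOR-like pattern.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# One hidden layer of sigmoid units with small random initial weights.
W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

eta = 1.0  # learning rate (illustrative)
for _ in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # Backward pass for squared error, then a plain gradient descent step.
    dp = (p - y) * p * (1 - p)
    dh = (dp @ W2.T) * h * (1 - h)
    W2 -= eta * h.T @ dp;  b2 -= eta * dp.sum(0)
    W1 -= eta * X.T @ dh;  b1 -= eta * dh.sum(0)

print(np.round(p.ravel(), 2))  # typically close to [0, 1, 1, 0] after training
```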
This paper presents a method for stabilizing and robustifying artificial neural networks trained with gradient descent. The proposed method constructs a dynamic model of the conventional update mechanism and derives stabilizing values of the learning rate. Stability in this context corresponds to convergence of the adjustable parameters of the neural network structure. I...
With the advent of high-throughput technologies, l1-regularized learning algorithms have attracted much attention recently. Dozens of algorithms have been proposed for fast implementation, using various advanced optimization techniques. In this paper, we demonstrate that l1-regularized learning problems can be easily solved using gradient-descent techniques. The basic idea is to transform a ...
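One standard way to make an l1-regularized problem amenable to plain gradient-descent techniques (possibly different from the transformation this truncated abstract refers to) is to split w = u - v with u, v >= 0, which turns the l1 term into a linear one and leaves a smooth problem over the non-negative orthant. The sketch below applies projected gradient descent to the least-squares case; the function name, step size, and toy data are illustrative assumptions.

```python
import numpy as np

def l1_least_squares_pgd(A, b, lam, eta=None, iters=500):
    """Projected gradient descent on the split w = u - v, with u, v >= 0.

    min_{u,v>=0} 0.5 * ||A(u - v) - b||^2 + lam * sum(u + v)
    is smooth over the non-negative orthant and equivalent to the
    l1-regularized least-squares problem in w = u - v.
    """
    n = A.shape[1]
    u, v = np.zeros(n), np.zeros(n)
    if eta is None:
        eta = 1.0 / np.linalg.norm(A, 2) ** 2  # step from the Lipschitz constant
    for _ in range(iters):
        g = A.T @ (A @ (u - v) - b)
        u = np.maximum(0.0, u - eta * (g + lam))    # gradient step + projection
        v = np.maximum(0.0, v - eta * (-g + lam))
    return u - v

# Toy usage on a random sparse-recovery instance.
rng = np.random.default_rng(1)
A = rng.normal(size=(30, 10))
w_true = np.zeros(10); w_true[:3] = [1.0, -2.0, 0.5]
b = A @ w_true
print(np.round(l1_least_squares_pgd(A, b, lam=0.1), 2))
```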
For large scale learning problems, it is desirable if we can obtain the optimal model parameters by going through the data in only one pass. Polyak and Juditsky (1992) showed that asymptotically the test performance of the simple average of the parameters obtained by stochastic gradient descent (SGD) is as good as that of the parameters which minimize the empirical cost. However, to our knowled...
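A minimal sketch of the averaging scheme attributed here to Polyak and Juditsky (1992): run SGD in a single pass over the data and return the running average of the iterates rather than the last iterate. The least-squares loss, the step-size schedule, and the toy stream below are illustrative assumptions, not details from the abstract.

```python
import numpy as np

def averaged_sgd(data, eta0=0.1, power=0.6):
    """One pass of SGD with Polyak-Juditsky iterate averaging.

    Least-squares loss on a stream of (x, y) pairs; the step size decays as
    eta0 / t**power with power in (0.5, 1), and the returned model is the
    running average of the iterates, not the last iterate.
    """
    w = np.zeros(data[0][0].shape[0])
    w_bar = np.zeros_like(w)
    for t, (x, y) in enumerate(data, start=1):
        grad = (w @ x - y) * x
        w -= (eta0 / t ** power) * grad
        w_bar += (w - w_bar) / t       # running average of iterates
    return w_bar

# Toy stream: noisy linear model, visited in a single pass.
rng = np.random.default_rng(2)
w_true = np.array([1.0, -0.5, 2.0])
stream = [(x, x @ w_true + 0.1 * rng.normal())
          for x in rng.normal(size=(5000, 3))]
print(np.round(averaged_sgd(stream), 2))
```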
In this paper, we propose to study the problem of heterogeneous transfer ranking, a transfer learning problem with heterogeneous features in order to utilize the rich large-scale labeled data in popular languages to help the ranking task in less popular languages. We develop a large-margin algorithm, namely LM-HTR, to solve the problem by mapping the input features in both the source domain and...
With a weighting scheme proportional to t, a traditional stochastic gradient descent (SGD) algorithm achieves a high-probability convergence rate of O(κ/T) for strongly convex functions, instead of O(κ ln(T)/T). We also prove that an accelerated SGD algorithm achieves a rate of O(κ/T).
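A sketch of how a weighting scheme proportional to t can be maintained online: each SGD iterate w_t contributes to the returned average with weight t. The ridge-regularized squared loss, the 1/(μt) step size for a μ-strongly convex objective, and the function name are illustrative assumptions rather than details from the abstract.

```python
import numpy as np

def sgd_t_weighted_average(data, mu=0.1):
    """SGD on a strongly convex objective with a t-weighted iterate average.

    Each iterate w_t receives weight proportional to t, so the returned model
    is (sum_t t * w_t) / (sum_t t). The 1/(mu*t) step size is the usual choice
    for mu-strongly convex losses; all constants here are illustrative.
    """
    w = np.zeros(data[0][0].shape[0])
    w_avg, weight_sum = np.zeros_like(w), 0.0
    for t, (x, y) in enumerate(data, start=1):
        grad = (w @ x - y) * x + mu * w          # ridge-regularized squared loss
        w -= grad / (mu * t)
        weight_sum += t
        w_avg += (t / weight_sum) * (w - w_avg)  # online t-weighted average
    return w_avg
```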
Chart: number of search results per year