Natural Gradient Works Efficiently in Learning
Author
Abstract
When a parameter space has a certain underlying structure, the ordinary gradient of a function does not represent its steepest direction but the natural gradient does. Information geometry is used for calculating the natural gradients in the parameter space of perceptrons, the space of matrices (for blind source separation) and the space of linear dynamical systems (for blind source deconvolution). The dynamical behavior of natural gradient on-line learning is analyzed and is proved to be Fisher efficient, implying that it has asymptotically the same performance as the optimal batch estimation of parameters. This suggests that the plateau phenomenon which appears in the backpropagation learning algorithm of multilayer perceptrons might disappear or might be not so serious when the natural gradient is used. An adaptive method of updating the learning rate is proposed and analyzed.
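As a rough illustration of the idea in the abstract, the natural gradient update can be written as theta <- theta - eta * G(theta)^{-1} * grad, where G is the Fisher information matrix. The sketch below is not taken from the paper: it assumes a univariate Gaussian model with parameters (mu, sigma), for which the Fisher matrix is diag(1/sigma^2, 2/sigma^2), and a 1/t learning rate. The function names and the sample data are hypothetical.

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, lr):
    """One update theta <- theta - lr * G^{-1} grad (G = Fisher matrix)."""
    return theta - lr * np.linalg.solve(fisher, grad)

def gaussian_nll_grad(theta, x):
    """Ordinary gradient of -log N(x; mu, sigma^2) with respect to (mu, sigma)."""
    mu, sigma = theta
    return np.array([-(x - mu) / sigma**2,
                     1.0 / sigma - (x - mu)**2 / sigma**3])

def gaussian_fisher(theta):
    """Fisher information of N(mu, sigma^2) in the (mu, sigma) parameterization."""
    _, sigma = theta
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])

def fit_online(samples, theta0=(0.0, 1.0)):
    """On-line natural gradient learning, one sample per step.

    The 1/t learning rate is an assumption of this sketch.
    """
    theta = np.array(theta0, dtype=float)
    for t, x in enumerate(samples, start=1):
        g = gaussian_nll_grad(theta, x)
        G = gaussian_fisher(theta)
        theta = natural_gradient_step(theta, g, G, lr=1.0 / t)
    return theta

# Synthetic data (assumed for illustration): true mean 2.0, true std 0.5.
rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=0.5, size=20000)
mu_hat, sigma_hat = fit_online(samples)
print(mu_hat, sigma_hat)
```

In this particular model the natural gradient step for mu reduces to the running sample mean, which is one way to see why on-line natural gradient learning can match batch estimation asymptotically.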
Similar resources
Text-Based Information Retrieval Using Exponentiated Gradient Descent
The following investigates the use of single-neuron learning algorithms to improve the performance of text-retrieval systems that accept natural-language queries. A retrieval process is explained that transforms the natural-language query into the query syntax of a real retrieval system: the initial query is expanded using statistical and learning techniques and is then used for document rankin...
Too Much Information Can Be Too Much for Learning Efficiently
In designing learning algorithms it seems quite reasonable to construct them in a way such that all data the algorithm already has obtained are correctly and completely reflected in the description the algorithm outputs on these data. However, this approach may totally fail, i.e., it may lead to the unsolvability of the learning problem, or it may exclude any efficient solution of it. In particula...
Identification of Multiple Input-multiple Output Non-linear System Cement Rotary Kiln using Stochastic Gradient-based Rough-neural Network
Because of the existing interactions among the variables of a multiple input-multiple output (MIMO) nonlinear system, its identification is a difficult task, particularly in the presence of uncertainties. Cement rotary kiln (CRK) is a MIMO nonlinear system in the cement factory with a complicated mechanism and uncertain disturbances. The identification of CRK is very important for different pur...
Adaptive Method of Realizing
The natural gradient learning method is known to have ideal performances for on-line training of multilayer perceptrons. It avoids plateaus which give rise to slow convergence of the backpropagation method. It is Fisher efficient whereas the conventional method is not. However, for implementing the method, it is necessary to calculate the Fisher information matrix and its inverse, which is practi...
An alternative switching criterion for independent component analysis (ICA)
In solving the problem of noiseless independent component analysis (ICA) in which super- and sub-Gaussian sources coexist in an unknown manner, one can be led to a feasible solution using the natural gradient learning algorithm with a kind of switching criterion for the model probability distribution densities to be selected as super- or sub-Gaussians appropriately during the iterations. In thi...