Training Recurrent Networks Using the Extended Kalman Filter
Abstract
The extended Kalman filter (EKF) can be used as an on-line algorithm to determine the weights in a recurrent network given target outputs as it runs. This paper notes some relationships between the EKF as applied to recurrent net learning and some simpler techniques that are more widely used. In particular, making certain simplifications to the EKF gives rise to an algorithm essentially identical to the real-time recurrent learning (RTRL) algorithm. Since the EKF involves adjusting unit activity in the network, it also provides a principled generalization of the teacher forcing technique. Preliminary simulation experiments on simple finite-state Boolean tasks indicate that the EKF can provide substantial speed-up in the number of time steps required for training on such problems when compared with simpler on-line gradient algorithms. The computational requirements of the EKF are steep, but turn out to scale with network size at the same rate as RTRL. These observations are intended to provide insights that may lead to recurrent net training techniques that allow better control over the tradeoff between computational cost and convergence time.
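To make the idea in the abstract concrete, the following is a minimal sketch of an EKF-style on-line weight update for a single recurrent tanh unit. It is not the paper's formulation: the abstract describes an EKF that also adjusts unit activity, whereas this sketch assumes a weights-only state and uses RTRL-style sensitivities as the measurement Jacobian. All names (`ekf_weight_update`, `unit_step`, `train_online`) and the parameters `p0`, `r`, and `q` are hypothetical choices made for illustration.

```python
import math
import random


def ekf_weight_update(w, P, h, y, target, r=1.0, q=0.0):
    """One EKF measurement update for a scalar output y with Jacobian h = dy/dw."""
    n = len(w)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]  # P h^T
    s = sum(h[i] * Ph[i] for i in range(n)) + r  # innovation variance (scalar)
    K = [Ph[i] / s for i in range(n)]            # Kalman gain
    err = target - y
    w_new = [w[i] + K[i] * err for i in range(n)]
    # P <- P - K h P + q I; h P equals (P h^T)^T because P is symmetric.
    P_new = [[P[i][j] - K[i] * Ph[j] + (q if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    return w_new, P_new


def unit_step(w, y_prev, s_prev, x):
    """Compute y_t = tanh(w0*y_{t-1} + w1*x_t + w2) and its RTRL sensitivities dy_t/dw."""
    z = [y_prev, x, 1.0]
    y = math.tanh(sum(wi * zi for wi, zi in zip(w, z)))
    d = 1.0 - y * y  # derivative of tanh at the current net input
    s = [d * (w[0] * s_prev[i] + z[i]) for i in range(3)]
    return y, s


def train_online(xs, targets, p0=10.0, r=1.0, q=1e-4):
    """Run the unit over a sequence, updating the weights after every time step."""
    w = [0.0, 0.0, 0.0]
    P = [[p0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    y, s = 0.0, [0.0, 0.0, 0.0]
    for x, t in zip(xs, targets):
        y, s = unit_step(w, y, s, x)
        w, P = ekf_weight_update(w, P, s, y, t, r, q)
    return w


if __name__ == "__main__":
    # Hypothetical teacher unit that generates the target sequence.
    random.seed(0)
    teacher = [0.5, 1.0, -0.2]
    xs = [random.uniform(-1.0, 1.0) for _ in range(500)]
    y, targets = 0.0, []
    for x in xs:
        y = math.tanh(teacher[0] * y + teacher[1] * x + teacher[2])
        targets.append(y)
    print(train_online(xs, targets))
```

In this sketch, if `P` were held at a fixed multiple of the identity instead of being updated, the weight change would be proportional to `h * err`, that is, a normalized gradient step on the RTRL sensitivities. This is one way to read the abstract's remark that simplifying the EKF yields something close to RTRL.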
Similar resources
Design of Instrumentation Sensor Networks for Non-Linear Dynamic Processes Using Extended Kalman Filter
This paper presents a methodology for design of instrumentation sensor networks in non-linear chemical plants. The method utilizes a robust extended Kalman filter approach to provide an efficient dynamic data reconciliation. A weighted objective function has been introduced to enable the designer to incorporate each individual process variable with its own operational importance. To enhance...
Complex Extended Kalman Filters for Training Recurrent Neural Network Channel Equalizers
The Kalman filter is named after Rudolf E. Kalman, who in 1960 published his famous paper (Kalman, 1960) describing a recursive solution to the discrete-data linear filtering problem. There are several tutorial papers and books dealing with the subject for a great variety of applications in many areas, from engineering to finance (Grewal & Andrews, 2001; Sorenson, 1970; Haykin, 2001; Bar-Shalom & L...
Sensorless Speed Control of Double Star Induction Machine With Five Level DTC Exploiting Neural Network and Extended Kalman Filter
This article presents a sensorless five-level DTC control based on neural networks, using an Extended Kalman Filter (EKF), applied to a Double Star Induction Machine (DSIM). The application of DTC control brings a very interesting solution to the problems of robustness and dynamics. However, this control has some drawbacks, such as the uncontrolled switching frequency and the strong ripple t...
Direct Method for Training Feed-Forward Neural Networks Using Batch Extended Kalman Filter for Multi-Step-Ahead Predictions
This paper is dedicated to the long-term, or multi-step-ahead, time series prediction problem. We propose a novel method for training feed-forward neural networks, such as multilayer perceptrons, with tapped delay lines. A special batch calculation of derivatives, called Forecasted Propagation Through Time, and a batch modification of the Extended Kalman Filter are introduced. Experiments were carrie...
Prediction of Dynamical Systems by Recurrent Neural Networks
Recurrent neural networks in general achieve better results in the prediction of time series than feedforward networks. Echo state neural networks seem to be one alternative to them. I have shown on the task of text correction that they achieve slightly better results compared to the already known method based on a Markov model. The major part of this work is focused on alternatives to recurrent neural ...
Recurrent Neural Network Training with the Extended Kalman Filter
Recurrent neural networks, in contrast to classical feedforward neural networks, better handle inputs that have space-time structure, e.g. symbolic time series. Since the classic gradient methods for recurrent neural network training converge very poorly and slowly on longer input sequences, alternative approaches are needed. We describe the principles of the training method with the Ext...