Recurrent Neural Network Regularization
Authors
Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals
Abstract
We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. Dropout, the most successful technique for regularizing neural networks, does not work well with RNNs and LSTMs. In this paper, we show how to correctly apply dropout to LSTMs, and show that it substantially reduces overfitting on a variety of tasks. These tasks include language modeling, speech recognition, image caption generation, and machine translation.
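To make the idea concrete, here is a minimal PyTorch sketch of applying dropout only to the non-recurrent connections of a stacked LSTM (input to layer 1, layer 1 to layer 2, layer 2 to output), leaving the hidden-to-hidden path untouched. Class, method, and parameter names are illustrative, not from the paper:

```python
import torch
import torch.nn as nn

class NonRecurrentDropoutLSTM(nn.Module):
    # Sketch: dropout is applied on every non-recurrent connection,
    # never on the recurrent (hidden-to-hidden) path, so the LSTM's
    # memory is not corrupted across time steps.
    def __init__(self, vocab_size=100, hidden_size=64, p=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.drop = nn.Dropout(p)                  # non-recurrent dropout only
        self.cell1 = nn.LSTMCell(hidden_size, hidden_size)
        self.cell2 = nn.LSTMCell(hidden_size, hidden_size)
        self.proj = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens):                     # tokens: (batch, time)
        batch = tokens.size(0)
        zeros = tokens.new_zeros(batch, self.cell1.hidden_size, dtype=torch.float)
        s1 = (zeros, zeros.clone())
        s2 = (zeros.clone(), zeros.clone())
        logits = []
        for t in range(tokens.size(1)):
            x = self.drop(self.embed(tokens[:, t]))     # dropped: input -> layer 1
            s1 = self.cell1(x, s1)                      # recurrent path: no dropout
            s2 = self.cell2(self.drop(s1[0]), s2)       # dropped: layer 1 -> layer 2
            logits.append(self.proj(self.drop(s2[0])))  # dropped: layer 2 -> output
        return torch.stack(logits, dim=1)

model = NonRecurrentDropoutLSTM()
out = model(torch.randint(0, 100, (8, 20)))        # -> (8, 20, 100) logits
```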
Similar resources
A constrained regularization approach for input-driven recurrent neural networks
We introduce a novel regularization approach for a class of input-driven recurrent neural networks. The regularization of network parameters is constrained to reimplement a previously recorded state trajectory. We derive a closed-form solution for network regularization and show that the method is capable of reimplementing harvested dynamics. We investigate important properties of the method and...
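The truncated abstract does not spell out the constraint set, but a common closed-form pattern for fitting weights so a network reproduces a recorded state trajectory is regularized least squares. The NumPy sketch below illustrates that generic pattern only; all names, the data, and the ridge formulation are assumptions, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, T = 50, 200
H = rng.standard_normal((n_units, T))   # previously recorded state trajectory

X = np.tanh(H[:, :-1])                  # driven activations at step t
Y = H[:, 1:]                            # target states at step t+1
lam = 1e-2                              # regularization strength

# Closed-form ridge solution W = Y X^T (X X^T + lam I)^(-1),
# written via solve() since X X^T + lam I is symmetric positive definite.
M = X @ X.T + lam * np.eye(n_units)
W = np.linalg.solve(M, X @ Y.T).T

print("trajectory reconstruction error:", np.linalg.norm(W @ X - Y))
```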
Noisin: Unbiased Regularization for Recurrent Neural Networks
Recurrent neural networks (RNNs) are powerful models of sequential data. They have been successfully used in domains such as text and speech. However, RNNs are susceptible to overfitting; regularization is important. In this paper we develop Noisin, a new method for regularizing RNNs. Noisin injects random noise into the hidden states of the RNN and then maximizes the corresponding marginal lik...
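A minimal sketch of the noise-injection idea, assuming zero-mean Gaussian noise so that the hidden state stays unbiased in expectation (E[h + eps] = h). The function name and the scale hyperparameter are hypothetical, not the paper's notation:

```python
import torch

def inject_hidden_noise(h, scale=0.1, training=True):
    # Zero-mean additive noise keeps the hidden state unbiased in
    # expectation; at evaluation time the state is left untouched.
    if not training:
        return h
    return h + scale * torch.randn_like(h)

# Hypothetical use inside one unrolled vanilla-RNN step:
W_x, W_h = torch.randn(32, 16), torch.randn(32, 32)
x, h = torch.randn(16), torch.zeros(32)
h = torch.tanh(W_x @ x + W_h @ h)
h = inject_hidden_noise(h)   # regularize by perturbing the hidden state
```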
Tikhonov Regularization for Long Short-Term Memory Networks
It is a well-known fact that adding noise to the input data often improves network performance. While the dropout technique can cause memory loss when applied to recurrent connections, Tikhonov regularization, which can be regarded as training with additive noise, avoids this issue naturally, though it requires deriving the regularizer for each architecture. In case of fee...
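The noise-as-Tikhonov view can be made precise in the simplest setting. The identity below is the classic equivalence for a linear model with squared loss (a sketch of the general idea, not the paper's derivation for LSTMs): training with additive input noise equals, in expectation, the clean loss plus a Tikhonov (L2) penalty on the weights.

```latex
% For eps ~ N(0, sigma^2 I), the cross term vanishes (E[eps] = 0) and
% E[(w^T eps)^2] = sigma^2 ||w||^2, leaving the clean loss plus an L2 penalty:
\mathbb{E}_{\epsilon \sim \mathcal{N}(0,\sigma^2 I)}
  \left[ \left( y - w^\top (x + \epsilon) \right)^2 \right]
  = \left( y - w^\top x \right)^2 + \sigma^2 \lVert w \rVert^2
```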
A Recurrent Neural Network Model for solving CCR Model in Data Envelopment Analysis
In this paper, we present a recurrent neural network model for solving the CCR model in Data Envelopment Analysis (DEA). The proposed neural network model is derived from an unconstrained minimization problem. On the theoretical side, it is shown that the proposed neural network is stable in the sense of Lyapunov and globally convergent to the optimal solution of the CCR model. The proposed model has...
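The general recipe behind such models (here and in the next entry) is to turn the unconstrained minimization of an energy E(x) into the gradient-flow dynamics dx/dt = -grad E(x); E itself then serves as a Lyapunov function, since dE/dt = -||grad E(x)||^2 <= 0, which is the sense in which the network is stable and convergent. The NumPy sketch below uses an illustrative quadratic energy, not the CCR-derived one from the paper:

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # positive definite -> unique minimum
b = np.array([1.0, -1.0])

def grad_E(x):
    # Gradient of the illustrative energy E(x) = 0.5 x^T A x - b^T x
    return A @ x - b

x = np.zeros(2)
for _ in range(1000):                     # Euler-integrate dx/dt = -grad E(x)
    x -= 0.05 * grad_E(x)

print("equilibrium:", x, "exact minimizer:", np.linalg.solve(A, b))
```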
A Recurrent Neural Network to Identify Efficient Decision Making Units in Data Envelopment Analysis
In this paper we present a recurrent neural network model to recognize efficient Decision Making Units (DMUs) in Data Envelopment Analysis (DEA). The proposed neural network model is derived from an unconstrained minimization problem. On the theoretical side, it is shown that the proposed neural network is stable in the sense of Lyapunov and globally convergent. The proposed model has a single-laye...
Journal: CoRR
Volume: abs/1409.2329
Issue: -
Pages: -
Publication date: 2014