Revisiting NARX Recurrent Neural Networks for Long-Term Dependencies

نویسندگان

  • Robert S. DiPietro
  • Nassir Navab
  • Gregory D. Hager
چکیده

Recurrent neural networks (RNNs) have shown success for many sequence-modeling tasks, but learning long-term dependencies from data remains difficult. This is often attributed to the vanishing gradient problem, which shows that gradient components relating a loss at time t to time t− τ tend to decay exponentially with τ . Long short-term memory (LSTM) and gated recurrent units (GRUs), the most widely-used RNN architectures, attempt to remedy this problem by making the decay’s base closer to 1. NARX RNNs1 take an orthogonal approach: by including direct connections, or delays, from the past, NARX RNNs make the decay’s exponent closer to 0. However, as introduced, NARX RNNs reduce the decay’s exponent only by a factor of nd, the number of delays, and simultaneously increase computation by this same factor. We introduce a new variant of NARX RNNs, called MIxed hiSTory RNNs, which addresses these drawbacks. We show that for τ ≤ 2nd−1, MIST RNNs reduce the decay’s worst-case exponent from τ/nd to log τ , while maintaining computational complexity that is similar to LSTM and GRUs. We compare MIST RNNs to simple RNNs, LSTM, and GRUs across 4 diverse tasks. MIST RNNs outperform all other methods in 2 cases, and in all cases are competitive.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

How embedded memory in recurrent neural network architectures helps learning long-term temporal dependencies

Learning long-term temporal dependencies with recurrent neural networks can be a difficult problem. It has recently been shown that a class of recurrent neural networks called NARX networks perform much better than conventional recurrent neural networks for learning certain simple long-term dependency problems. The intuitive explanation for this behavior is that the output memories of a NARX ne...

متن کامل

Learning long-term dependencies in NARX recurrent neural networks

It has previously been shown that gradient-descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies, i.e. those problems for which the desired output depends on inputs presented at times far in the past. We show that the long-term dependencies problem is lessened for a class of architectures called nonlinear autoregressive models ...

متن کامل

Learning Long{term Dependencies Is Not as Diicult with Narx Recurrent Neural Networks

It has recently been shown that gradient descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long{term dependencies, i.e. those problems for which the desired output depends on inputs presented at times far in the past. In this paper we explore the long{term dependencies problem for a class of architectures called NARX recurrent neural networks, wh...

متن کامل

Learning Long-Term Dependencies in NARX Recurrent Neural Networks - Neural Networks, IEEE Transactions on

It has recently been shown that gradient-descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies, i.e., those problems for which the desired output depends on inputs presented at times far in the past. We show that the long-term dependencies problem is lessened for a class of architectures called Nonlinear AutoRegressive models w...

متن کامل

Pii: S0893-6080(98)00018-5

Learning long-term temporal dependencies with recurrent neural networks can be a difficult problem. It has recently been shown that a class of recurrent neural networks called NARX networks perform much better than conventional recurrent neural networks for learning certain simple long-term dependency problems. The intuitive explanation for this behavior is that the output memories of a NARX ne...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1702.07805  شماره 

صفحات  -

تاریخ انتشار 2017