Multiplicative LSTM for sequence modelling

نویسندگان

  • Ben Krause
  • Liang Lu
  • Iain Murray
  • Steve Renals
چکیده

We introduce multiplicative LSTM (mLSTM), a novel recurrent neural network architecture for sequence modelling that combines the long short-term memory (LSTM) and multiplicative recurrent neural network architectures. mLSTM is characterised by its ability to have different recurrent transition functions for each possible input, which we argue makes it more expressive for autoregressive density estimation. We demonstrate empirically that mLSTM outperforms standard LSTM and its deep variants for a range of character level modelling tasks, and that this improvement increases with the complexity of the task. This model achieves a test error of 1.19 bits/character on the last 4 million characters of the Hutter prize dataset when combined with dynamic evaluation.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Optimizing and Contrasting Recurrent Neural Network Architectures

Recurrent Neural Networks (RNNs) have long been recognized for their potential to model complex time series. However, it remains to be determined what optimization techniques and recurrent architectures can be used to best realize this potential. The experiments presented take a deep look into Hessian free optimization, a powerful second order optimization method that has shown promising result...

متن کامل

Understanding Gating Operations in Recurrent Neural Networks through Opinion Expression Extraction

Extracting opinion expressions from text is an essential task of sentiment analysis, which is usually treated as one of the word-level sequence labeling problems. In such problems, compositional models with multiplicative gating operations provide efficient ways to encode the contexts, as well as to choose critical information. Thus, in this paper, we adopt Long Short-Term Memory (LSTM) recurre...

متن کامل

Long Short-Term Memory

Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, ...

متن کامل

RESOLUTION METHOD FOR MIXED INTEGER LINEAR MULTIPLICATIVE-LINEAR BILEVEL PROBLEMS BASED ON DECOMPOSITION TECHNIQUE

In this paper, we propose an algorithm base on decomposition technique for solvingthe mixed integer linear multiplicative-linear bilevel problems. In actuality, this al-gorithm is an application of the algorithm given by G. K. Saharidis et al for casethat the rst level objective function is linear multiplicative. We use properties ofquasi-concave of bilevel programming problems and decompose th...

متن کامل

A Study on LSTM Networks for Polyphonic Music Sequence Modelling

Neural networks, and especially long short-term memory networks (LSTM), have become increasingly popular for sequence modelling, be it in text, speech, or music. In this paper, we investigate the predictive power of simple LSTM networks for polyphonic MIDI sequences, using an empirical approach. Such systems can then be used as a music language model which, combined with an acoustic model, can ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1609.07959  شماره 

صفحات  -

تاریخ انتشار 2016