Blending LSTMs into CNNs
نویسندگان
چکیده
We show that a deep convolutional network with an architecture inspired by the models used in image recognition can yield accuracy similar to a long-short term memory (LSTM) network, which achieves the state-of-the-art performance on the standard Switchboard automatic speech recognition task. Moreover, we demonstrate that merging the knowledge in the CNN and LSTM models via model compression further improves the accuracy of the convolutional model.
منابع مشابه
BB_twtr at SemEval-2017 Task 4: Twitter Sentiment Analysis with CNNs and LSTMs
In this paper we describe our attempt at producing a state-of-the-art Twitter sentiment classifier using Convolutional Neural Networks (CNNs) and Long Short Term Memory (LSTMs) networks. Our system leverages a large amount of unlabeled data to pre-train word embeddings. We then use a subset of the unlabeled data to fine tune the embeddings using distant supervision. The final CNNs and LSTMs are...
متن کاملFast and Accurate Entity Recognition with Iterated Dilated Convolutions
Today when many practitioners run basic NLP on the entire web and large-volume traffic, faster methods are paramount to saving time and energy costs. Recent advances in GPU hardware have led to the emergence of bi-directional LSTMs as a standard method for obtaining pertoken vector representations serving as input to labeling tasks such as NER (often followed by prediction in a linear-chain CRF...
متن کاملIntroduction to CNNs and LSTMs for NLP
I put together these notes as part of my TA work for the Graph and Text Mining grad course of Prof. Michalis Vazirgiannis in the Spring of 2017. They accompanied a programming lab session about Convolutional Neural Networks (CNNs) and Long Short Term Memory networks (LSTMs) for document classification, using Python and Keras1. Keras is a very popular Python library for deep learning. It is a wr...
متن کاملFast and Accurate Sequence Labeling with Iterated Dilated Convolutions
Today when many practitioners run basic NLP on the entire web and large-volume traffic, faster methods are paramount to saving time and energy costs. Recent advances in GPU hardware have led to the emergence of bi-directional LSTMs as a standard method for obtaining pertoken vector representations serving as input to labeling tasks such as NER (often followed by prediction in a linear-chain CRF...
متن کاملModeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks
Various neural network architectures have been proposed in the literature to model 2D correlations in the input signal, including convolutional layers, frequency LSTMs and 2D LSTMs such as time-frequency LSTMs, grid LSTMs and ReNet LSTMs. It has been argued that frequency LSTMs can model translational variations similar to CNNs, and 2D LSTMs can model even more variations [1], but no proper com...
متن کامل