Stacked RNNs for Encoder-Decoder Networks: Accurate Machine Understanding of Images
نویسنده
چکیده
We address the image captioning task by combining a convolutional neural network (CNN) with various recurrent neural network architectures. We train the models on over 400,000 training examples ( roughly 80,000 images, with 5 captions per image) from the Microsoft 2014 COCO challenge. We demonstrate that stacking a 2-Layer RNN provides better results on image captioning tasks than both a Vanilla LSTM and a Vanilla RNN.
منابع مشابه
Asynchronous Bidirectional Decoding for Neural Machine Translation
The dominant neural machine translation (NMT) models apply unified attentional encoder-decoder neural networks for translation. Traditionally, the NMT decoders adopt recurrent neural networks (RNNs) to perform translation in a left-toright manner, leaving the target-side contexts generated from right to left unexploited during translation. In this paper, we equip the conventional attentional en...
متن کاملRefining Source Representations with Relation Networks for Neural Machine Translation
Although neural machine translation (NMT) with the encoder-decoder framework has achieved great success in recent times, it still suffers from some drawbacks: RNNs tend to forget old information which is often useful and the encoder only operates through words without considering word relationship. To solve these problems, we introduce a relation networks (RN) into NMT to refine the encoding re...
متن کاملResidual Stacking of RNNs for Neural Machine Translation
To enhance Neural Machine Translation models, several obvious ways such as enlarging the hidden size of recurrent layers and stacking multiple layers of RNN can be considered. Surprisingly, we observe that using naively stacked RNNs in the decoder slows down the training and leads to degradation in performance. In this paper, We demonstrate that applying residual connections in the depth of sta...
متن کاملA New Method for Detecting Ships in Low Size and Low Contrast Marine Images: Using Deep Stacked Extreme Learning Machines
Detecting ships in marine images is an essential problem in maritime surveillance systems. Although several types of deep neural networks have almost ubiquitously used for this purpose, but the performance of such networks greatly drops when they are exposed to low size and low contrast images which have been captured by passive monitoring systems. On the other hand factors such as sea waves, c...
متن کاملSequence-to-Sequence RNNs for Text Summarization
In this work, we cast text summarization as a sequence-to-sequence problem and apply the attentional encoder-decoder RNN that has been shown to be successful for Machine Translation (Bahdanau et al. (2014)). Our experiments show that the proposed architecture significantly outperforms the state-of-the art model of Rush et al. (2015) on the Gigaword dataset without any additional tuning. We also...
متن کامل