Autoencoder for words
Authors
Abstract
This paper presents a training method that encodes each word into a distinct vector in a semantic space and relates the resulting codes to low-entropy coding. An Elman network is employed to process word sequences drawn from literary works. The trained codes possess reduced entropy and are used for ranking, indexing, and categorizing literary works. A modification of the method is also presented that trains multiple vectors for each polysemous word, with each vector representing a different meaning of that word. The method is applied to stylistic analyses of two Chinese novels, Dream of the Red Chamber and Romance of the Three Kingdoms.
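To make the mechanism concrete, here is a minimal sketch, not the authors' published code: an Elman-style simple recurrent network trained to predict the next word in a sequence, after which each word's learned input vector serves as its code. The vocabulary size, layer dimensions, and toy data below are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

VOCAB, EMBED, HIDDEN = 1000, 64, 128  # assumed sizes, not from the paper

class ElmanWordCoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMBED)              # one vector per word
        self.rnn = nn.RNN(EMBED, HIDDEN, batch_first=True)   # Elman (simple) RNN
        self.out = nn.Linear(HIDDEN, VOCAB)                  # next-word prediction

    def forward(self, tokens):                               # tokens: (batch, seq)
        h, _ = self.rnn(self.embed(tokens))
        return self.out(h)                                   # logits per position

model = ElmanWordCoder()
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

seq = torch.randint(0, VOCAB, (8, 20))                       # toy word-id sequences
logits = model(seq[:, :-1])                                  # predict the next word
loss = loss_fn(logits.reshape(-1, VOCAB), seq[:, 1:].reshape(-1))
loss.backward()
opt.step()

word_codes = model.embed.weight.detach()                     # trained word vectors
```

In the paper's setting the trained codes are additionally analyzed for entropy reduction and used for ranking and indexing; those steps are omitted here.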
Similar Resources
Learning Multilingual Word Representations using a Bag-of-Words Autoencoder
Recent work on learning multilingual word representations usually relies on word-level alignments (e.g., inferred with the help of GIZA++) between translated sentences, in order to align the word embeddings in different languages. In this workshop paper, we investigate an autoencoder model for learning multilingual word representations that does without such word-level alignments. Th...
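As a rough illustration of the bag-of-words idea, a sketch under assumed vocabulary sizes, dimensions, and toy data rather than the workshop paper's actual model: a sentence's bag-of-words in one language is encoded, and the bag-of-words of its translation is reconstructed, so no word-level alignments are needed.

```python
import torch
import torch.nn as nn

V_SRC, V_TGT, DIM = 5000, 5000, 128       # assumed sizes

encoder = nn.Linear(V_SRC, DIM)           # source bag-of-words -> shared space
decoder = nn.Linear(DIM, V_TGT)           # shared space -> target bag-of-words
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()])

src_bow = torch.rand(32, V_SRC)                    # toy stand-in for a batch of
tgt_bow = (torch.rand(32, V_TGT) > 0.99).float()   # translated sentence pairs

logits = decoder(torch.tanh(encoder(src_bow)))
loss = nn.functional.binary_cross_entropy_with_logits(logits, tgt_bow)
loss.backward()
opt.step()

# Columns of encoder.weight give source-language word embeddings; training in
# both directions would tie the two languages into one shared space.
```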
Incorporating visual features into word embeddings: A bimodal autoencoder-based approach
Multimodal semantic representation is an evolving area of research in natural language processing as well as computer vision. Combining or integrating perceptual information, such as visual features, with linguistic features has recently been actively studied. This paper presents a novel bimodal autoencoder model for multimodal representation learning: the autoencoder learns in order to enhance...
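A minimal sketch of the general bimodal-autoencoder pattern described above; the feature dimensions and fusion scheme are assumptions, not the paper's architecture. Textual and visual feature vectors are fused into a single code that must reconstruct both modalities.

```python
import torch
import torch.nn as nn

TEXT_D, VIS_D, CODE_D = 300, 512, 256     # assumed feature dimensions

class BimodalAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(TEXT_D + VIS_D, CODE_D), nn.ReLU())
        self.dec = nn.Linear(CODE_D, TEXT_D + VIS_D)

    def forward(self, text, vis):
        code = self.enc(torch.cat([text, vis], dim=-1))   # fused representation
        recon = self.dec(code)
        return code, recon.split([TEXT_D, VIS_D], dim=-1)

model = BimodalAE()
text, vis = torch.randn(16, TEXT_D), torch.randn(16, VIS_D)  # toy features
code, (text_hat, vis_hat) = model(text, vis)
loss = (nn.functional.mse_loss(text_hat, text)
        + nn.functional.mse_loss(vis_hat, vis))              # reconstruct both
```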
Multi-Domain Sentiment Relevance Classification with Automatic Representation Learning
Sentiment relevance (SR) aims at identifying content that does not contribute to sentiment analysis. Previously, automatic SR classification has been studied in a limited scope, using a single domain and feature augmentation techniques that require large hand-crafted databases. In this paper, we present experiments on SR classification with automatically learned feature representations on multi...
Collaborative Recurrent Autoencoder: Recommend while Learning to Fill in the Blanks
Hybrid methods that utilize both content and rating information are commonly used in many recommender systems. However, most of them use either handcrafted features or the bag-of-words representation as a surrogate for the content information, but these are neither effective nor natural enough. To address this problem, we develop a collaborative recurrent autoencoder (CRAE) which is a denoising r...
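To illustrate the denoising-recurrent-autoencoder ingredient named above, here is a sketch with assumed hyper-parameters and toy data, not the CRAE model itself: words in an item's text are randomly blanked, and the network must reconstruct the original sequence.

```python
import torch
import torch.nn as nn

VOCAB, DIM = 2000, 128                    # assumed sizes
embed = nn.Embedding(VOCAB, DIM)
enc = nn.GRU(DIM, DIM, batch_first=True)
dec = nn.Linear(DIM, VOCAB)

seq = torch.randint(1, VOCAB, (4, 15))    # toy item descriptions
mask = torch.rand(seq.shape) < 0.3
noisy = seq.masked_fill(mask, 0)          # 0 acts as a "blank" token

out, _ = enc(embed(noisy))                # encode the corrupted sequence
loss = nn.functional.cross_entropy(dec(out).reshape(-1, VOCAB),
                                   seq.reshape(-1))   # fill in the blanks

# In a hybrid recommender, the final hidden state could serve as the item's
# content vector inside a collaborative-filtering objective.
```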
Recurrent Neural Network-Based Semantic Variational Autoencoder for Sequence-to-Sequence Learning
Sequence-to-sequence (Seq2seq) models have played an important role in the recent success of various natural language processing methods, such as machine translation, text summarization, and speech recognition. However, current Seq2seq models have trouble preserving global latent information from a long sequence of words. Variational autoencoder (VAE) alleviates this problem by learning a continuo...
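A compact sketch of the VAE mechanism referenced above, using a standard reparameterized Gaussian latent; the sizes and architecture are assumptions, not the paper's model. The encoder RNN compresses the whole word sequence into a global latent vector that initializes the decoder.

```python
import torch
import torch.nn as nn

VOCAB, DIM, LATENT = 2000, 128, 32        # assumed sizes
embed = nn.Embedding(VOCAB, DIM)
enc = nn.GRU(DIM, DIM, batch_first=True)
to_mu, to_logvar = nn.Linear(DIM, LATENT), nn.Linear(DIM, LATENT)
from_z = nn.Linear(LATENT, DIM)
dec = nn.GRU(DIM, DIM, batch_first=True)
out = nn.Linear(DIM, VOCAB)

seq = torch.randint(0, VOCAB, (4, 12))    # toy word sequences
_, h = enc(embed(seq))                    # h: (1, batch, DIM)
mu, logvar = to_mu(h[-1]), to_logvar(h[-1])
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization

h0 = torch.tanh(from_z(z)).unsqueeze(0)   # global latent initializes decoder
recon, _ = dec(embed(seq), h0)
rec_loss = nn.functional.cross_entropy(out(recon).reshape(-1, VOCAB),
                                       seq.reshape(-1))
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = rec_loss + kl                      # ELBO (unit KL weight assumed)
```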
Towards Representation Learning for Biomedical Concept Detection in Medical Images: UA.PT Bioinformatics in ImageCLEF 2017
Representation learning is a field that has rapidly evolved during the last decade, with much of this progress being driven by the latest breakthroughs in deep learning. Digital medical imaging is a particularly interesting application since representation learning may enable better medical decision support systems. ImageCLEFcaption focuses on automatic information extraction from biomedical im...
Journal: Neurocomputing
Volume: 139
Pages: -
Publication date: 2014