Squeezing bottlenecks: Exploring the limits of autoencoder semantic representation capabilities

Authors

  • Parth Gupta
  • Rafael E. Banchs
  • Paolo Rosso
Abstract

We present a comprehensive study on the use of autoencoders for modelling text data in which, unlike previous studies, we focus on the following issues: i) we explore the suitability of two different models, bDA and rsDA, for constructing deep autoencoders for text data at the sentence level; ii) we propose and evaluate two novel metrics for better assessing the text-reconstruction capabilities of autoencoders; and iii) we propose an automatic method to find the critical bottleneck dimensionality for text language representations (below which structural information is lost).
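
To make the idea of a critical bottleneck dimensionality concrete, here is a minimal sketch. It is not the authors' method, which the abstract does not describe. It assumes a linear autoencoder stand-in (truncated SVD, whose subspace an optimal linear autoencoder recovers), synthetic data with a known intrinsic dimension of 5, and a hypothetical tolerance `tol`; the function names are illustrative only.

```python
import numpy as np

def relative_errors(X):
    """errors[k] = fraction of variance lost with a k-dimensional linear bottleneck."""
    Xc = X - X.mean(axis=0)                      # centre the data
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    energy = s ** 2
    # tail[k] = energy discarded when only the top-k components are kept
    tail = np.concatenate([np.cumsum(energy[::-1])[::-1], [0.0]])
    return tail / energy.sum()

def critical_bottleneck(X, tol=1e-6):
    """Smallest bottleneck size whose relative reconstruction error is <= tol (assumed criterion)."""
    errors = relative_errors(X)
    return int(np.argmax(errors <= tol))         # index of the first k meeting the tolerance

# Synthetic data: 200 samples in 30 dimensions generated from 5 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 30))
X = latent @ mixing

errors = relative_errors(X)
critical_dim = critical_bottleneck(X)
print("critical bottleneck size:", critical_dim)
```

In this toy setting the error stays clearly above the tolerance for every bottleneck smaller than the number of latent factors and collapses to numerical zero once that size is reached. A deep autoencoder on real sentences would need a trained model per bottleneck size and a task-appropriate error metric in place of the SVD shortcut.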

Similar resources

A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...

Incorporating visual features into word embeddings: A bimodal autoencoder-based approach

Multimodal semantic representation is an evolving area of research in natural language processing as well as computer vision. Combining or integrating perceptual information, such as visual features, with linguistic features has recently been actively studied. This paper presents a novel bimodal autoencoder model for multimodal representation learning: the autoencoder learns in order to enhance...

Exploring a Mixed Representation for Encoding Temporal Coherence

Guiding representation learning towards temporally stable features improves object identity encoding from video. Existing models have applied temporal coherence uniformly over all features based on the assumption that optimal object identity encoding only requires temporally stable components. We explore the effects of mixing temporally coherent invariant features alongside variable features in...

Exploring the basic elements of successful knowledge management system with presenting a theory through a semantic network

Abstract: Nowadays knowledge is recognized as an important enabler of competitive advantage, and many companies are beginning to establish knowledge management systems. Within the last few years many organizations have tried to design a suitable knowledge management system, and many of them were successful. This paper aims to discover critical success factors (CSF) of knowledge management (KM) and the...

Journal:
  • Neurocomputing

Volume 175

Publication date: 2016