Capturing Nonlinear Structure in Word Spaces through Dimensionality Reduction

Authors

  • David Jurgens
  • Keith Stevens
Abstract

Dimensionality reduction has been shown to improve processing and information extraction from high dimensional data. Word space algorithms typically employ linear reduction techniques that assume the space is Euclidean. We investigate the effects of extracting nonlinear structure in the word space using Locality Preserving Projections, a reduction algorithm that performs manifold learning. We apply this reduction to two common word space models and show improved performance over the original models on benchmarks.
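The abstract names Locality Preserving Projections (LPP) but does not spell out the algorithm. As a rough illustration of the technique only (not the authors' implementation; the neighborhood size, heat-kernel width, and regularization constant below are assumptions), a minimal NumPy/SciPy sketch of LPP looks like this:

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_components=2, n_neighbors=5, t=1.0):
    """Minimal Locality Preserving Projections sketch.

    X: (n_samples, n_features) data matrix (e.g. word co-occurrence rows).
    Returns the data projected onto n_components LPP directions.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # k-nearest-neighbor adjacency graph with heat-kernel edge weights
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(sq[i])[1:n_neighbors + 1]  # skip self at position 0
        W[i, idx] = np.exp(-sq[i, idx] / t)
    W = np.maximum(W, W.T)            # symmetrize the graph
    D = np.diag(W.sum(axis=1))        # degree matrix
    L = D - W                         # graph Laplacian
    # Generalized eigenproblem  X^T L X a = lambda X^T D X a;
    # small ridge term keeps the right-hand matrix positive definite
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])
    vals, vecs = eigh(A, B)
    # Projection directions correspond to the smallest eigenvalues
    return X @ vecs[:, :n_components]
```

Unlike SVD-based reductions, the objective here penalizes mapping graph-neighboring points far apart, which is how LPP preserves local (potentially nonlinear) structure.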


Similar Articles

Image Spaces and Video Trajectories: Using Isomap to Explore Video Sequences

Dimensionality reduction techniques seek to represent a set of images as a set of points in a low dimensional space. Here we explore a video representation that considers a video as two parts – a space of possible images and a trajectory through that space. The nonlinear dimensionality reduction technique Isomap gives, for many interesting scenes, a very low dimensional representation of th...
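Isomap, as described in the abstract above, approximates geodesic distances along the data manifold through a neighbor graph and then embeds them with classical MDS. A minimal self-contained sketch (the neighborhood size is an arbitrary assumption, and this assumes the neighbor graph is connected):

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def isomap(X, n_neighbors=10, n_components=2):
    """Minimal Isomap sketch: geodesic distances + classical MDS."""
    n = X.shape[0]
    D = squareform(pdist(X))
    # k-NN graph: keep only edges to each point's nearest neighbors
    # (inf entries are treated as non-edges by shortest_path)
    G = np.full((n, n), np.inf)
    for i in range(n):
        idx = np.argsort(D[i])[1:n_neighbors + 1]
        G[i, idx] = D[i, idx]
    G = np.minimum(G, G.T)  # symmetrize (union of neighborhoods)
    # Geodesic distances = shortest paths through the neighbor graph
    geo = shortest_path(G, method="D", directed=False)
    # Classical MDS on the geodesic distance matrix
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * H @ (geo ** 2) @ H
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))
```

For a video, each frame would be one row of X, and the resulting low-dimensional points traced in order form the trajectory the abstract describes.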


Regularized Orthogonal Local Fisher Discriminant Analysis

To address the inability of the recently proposed Regularized Orthogonal Linear Discriminant Analysis (ROLDA) to preserve local nonlinear structure during dimensionality reduction, this paper proposes a dimensionality reduction algorithm named Regularized Orthogonal Local Fisher Discriminant Analysis (ROLFDA), which originates from ROLDA. The algorithm introduces the ide...


Visualising kernel spaces

Classification in kernel machines consists of a nonlinear transformation of input data into a feature space, followed by a separation with a linear hyperplane. This transformation is expressed through a kernel function, which is capable of computing similarities between two data points in an abstract geometric space for which individual point vectors are computationally intractable. In this pap...
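One standard way to visualise such a feature space (kernel PCA, chosen here for illustration; the abstract does not confirm this is the paper's method) is to eigendecompose the centered kernel matrix and plot the leading projections, which never requires the intractable feature vectors themselves:

```python
import numpy as np

def rbf_kernel_pca(X, gamma=1.0, n_components=2):
    """Sketch: project data onto the principal axes of an RBF
    kernel's implicit feature space (kernel PCA)."""
    n = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq)                 # kernel similarities only
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H                          # center in feature space
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]
    # Coordinates of the (implicit) feature vectors on the top axes
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))
```

Every step operates on the n-by-n matrix of pairwise similarities K, which is exactly the point the abstract makes: the geometry of the feature space is accessible through the kernel function alone.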


The World Is Not Always Flat, or Learning Curved Manifolds

Manifold learning and finding low-dimensional structure in data is an important task. Many algorithms for this purpose embed data in Euclidean space, an approach which is destined to fail on non-flat data. This paper presents a non-iterative algebraic method for embedding the data into hyperbolic and spherical spaces. We argue that these spaces are often better than Euclidean space in capturing...


Word, graph and manifold embedding from Markov processes

Authors: Tatsunori Hashimoto, David Alvarez-Melis, Tommi S. Jaakkola

Continuous vector representations of words and objects appear to carry surprisingly rich semantic content. In this paper, we advance both the conceptual and theoretical understanding of word embeddings in three ways. First, we ground embeddings in semantic spaces studied in the cognitive-psychometric literature and introduce new evaluation tasks. Second, in contrast to prior work, we take metric rec...



Publication date: 2010