Capturing Nonlinear Structure in Word Spaces through Dimensionality Reduction
Authors
Abstract
Dimensionality reduction has been shown to improve processing and information extraction from high dimensional data. Word space algorithms typically employ linear reduction techniques that assume the space is Euclidean. We investigate the effects of extracting nonlinear structure in the word space using Locality Preserving Projections, a reduction algorithm that performs manifold learning. We apply this reduction to two common word space models and show improved performance over the original models on benchmarks.
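Locality Preserving Projections (LPP) finds linear projection directions that keep neighbouring points close, by solving a generalized eigenproblem built from a nearest-neighbour graph. The sketch below is a minimal illustration of that standard recipe, not the paper's implementation; the heat-kernel width `t`, the neighbour count `k`, and the small ridge added for numerical stability are all assumptions.

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_components=2, k=5, t=1.0):
    """Locality Preserving Projections sketch.

    X: (n_samples, n_features) data matrix.
    Returns a (n_features, n_components) projection matrix.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # k-nearest-neighbour adjacency with heat-kernel weights,
    # symmetrized so the graph Laplacian below is well defined.
    W = np.zeros((n, n))
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]  # skip self (column 0)
    for i in range(n):
        for j in idx[i]:
            w = np.exp(-d2[i, j] / t)
            W[i, j] = w
            W[j, i] = w
    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian
    # Generalized eigenproblem: X^T L X a = lambda X^T D X a.
    A = X.T @ L @ X
    B = X.T @ D @ X
    B += 1e-6 * np.eye(B.shape[0])  # ridge for stability (assumption)
    vals, vecs = eigh(A, B)  # eigenvalues in ascending order
    # Projection directions are the eigenvectors with smallest eigenvalues.
    return vecs[:, :n_components]
```

A word space stored as a term matrix `X` would then be reduced with `X @ lpp(X, n_components=d)`.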
Similar Papers
Image Spaces and Video Trajectories: Using Isomap to Explore Video Sequences
Dimensionality reduction techniques seek to represent a set of images as a set of points in a low-dimensional space. Here we explore a video representation that considers a video as two parts – a space of possible images and a trajectory through that space. The nonlinear dimensionality reduction technique Isomap gives, for many interesting scenes, a very low-dimensional representation of th...
Regularized Orthogonal Local Fisher Discriminant Analysis
To address the inability of the recently proposed Regularized Orthogonal Linear Discriminant Analysis (ROLDA) to preserve local nonlinear structure during dimensionality reduction, this paper proposes a dimensionality reduction algorithm named Regularized Orthogonal Local Fisher Discriminant Analysis (ROLFDA), which is derived from ROLDA. The algorithm introduces the ide...
Visualising kernel spaces
Classification in kernel machines consists of a nonlinear transformation of input data into a feature space, followed by a separation with a linear hyperplane. This transformation is expressed through a kernel function, which is capable of computing similarities between two data points in an abstract geometric space for which individual point vectors are computationally intractable. In this pap...
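The kernel trick described above can be made concrete with the common RBF (Gaussian) kernel: similarity in an implicit high-dimensional feature space is computed directly from the input vectors, without ever constructing feature vectors. A minimal sketch, with the bandwidth `gamma` chosen arbitrarily for illustration:

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2).

    Returns a similarity in (0, 1]: 1.0 for identical points,
    approaching 0 as the points move apart. The corresponding
    feature space is infinite-dimensional, yet the value is
    computed from the inputs alone (the kernel trick).
    """
    return np.exp(-gamma * np.sum((x - y) ** 2))
```

A kernel machine (e.g. an SVM) separates classes with a hyperplane in that implicit feature space, using only such pairwise similarities.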
The World is not always Flat or Learning Curved Manifolds
Manifold learning and finding low-dimensional structure in data is an important task. Many algorithms for this purpose embed data in Euclidean space, an approach which is destined to fail on non-flat data. This paper presents a non-iterative algebraic method for embedding the data into hyperbolic and spherical spaces. We argue that these spaces are often better than Euclidean space in capturing...
Word, graph and manifold embedding from Markov processes (Tatsunori Hashimoto, David Alvarez-Melis, Tommi S. Jaakkola)
Continuous vector representations of words and objects appear to carry surprisingly rich semantic content. In this paper, we advance both the conceptual and theoretical understanding of word embeddings in three ways. First, we ground embeddings in semantic spaces studied in the cognitive-psychometric literature and introduce new evaluation tasks. Second, in contrast to prior work, we take metric rec...