Nonlinear Dimensionality Reduction as Information Retrieval

Authors

  • Jarkko Venna
  • Samuel Kaski
Abstract

Nonlinear dimensionality reduction has so far been treated either as a data representation problem or as a search for a lower-dimensional manifold embedded in the data space. A main application for both is in information visualization, to make visible the neighborhood or proximity relationships in the data, but neither approach has been designed to optimize this task. We give such visualization a new conceptualization as an information retrieval problem; a projection is good if neighbors of data points can be retrieved well based on the visualized projected points. This makes it possible to rigorously quantify goodness in terms of precision and recall. A method is introduced to optimize retrieval quality; it turns out to be an extension of Stochastic Neighbor Embedding, one of the earlier nonlinear projection methods, for which we give a new interpretation: it optimizes recall. The new method is shown empirically to outperform existing dimensionality reduction methods.
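
The retrieval view described in the abstract can be illustrated with a small sketch. It assumes the simplest possible reading: the "relevant" items for a point are its k nearest neighbors in the original space, and the "retrieved" items are its k nearest neighbors in the projection. The function names (`k_nearest`, `neighbor_precision_recall`) and the hard neighbor sets are assumptions made for illustration; this is not the authors' code, and the abstract does not specify how their method defines or optimizes these quantities.

```python
import math


def k_nearest(points, i, k):
    """Return the indices of the k nearest neighbors of points[i].

    Uses Euclidean distance and excludes the point itself.
    """
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])


def neighbor_precision_recall(original, projected, k_relevant, k_retrieved):
    """Mean precision and recall of neighbor retrieval over all points.

    relevant  = k_relevant nearest neighbors in the original space
    retrieved = k_retrieved nearest neighbors in the projection
    (Hypothetical helper: hard neighbor sets are an assumption of this sketch.)
    """
    n = len(original)
    precision_sum = 0.0
    recall_sum = 0.0
    for i in range(n):
        relevant = k_nearest(original, i, k_relevant)
        retrieved = k_nearest(projected, i, k_retrieved)
        hits = len(relevant & retrieved)
        precision_sum += hits / k_retrieved  # fraction of retrieved that are true neighbors
        recall_sum += hits / k_relevant      # fraction of true neighbors that were retrieved
    return precision_sum / n, recall_sum / n


if __name__ == "__main__":
    # Toy data: 3-D points and a projection that simply drops the last coordinate.
    data = [(0.0, 0.0, 0.0), (1.0, 0.0, 5.0), (3.0, 0.0, 0.0), (7.0, 0.0, 5.0)]
    projection = [(x, y) for x, y, _ in data]
    precision, recall = neighbor_precision_recall(data, projection, 1, 1)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this reading, a projection that pulls non-neighbors close together lowers precision (false neighbors are retrieved), while one that pushes true neighbors apart lowers recall (true neighbors are missed).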

Similar resources

Information Retrieval Perspective to Nonlinear Dimensionality Reduction for Data Visualization

Nonlinear dimensionality reduction methods are often used to visualize high-dimensional data, although the existing methods have been designed for other related tasks such as manifold learning. It has been difficult to assess the quality of visualizations since the task has not been well-defined. We give a rigorous definition for a specific visualization task, resulting in quantifiable goodness...

Fisher information embedding for video indexing and retrieval

In this paper, we present a novel information embedding based approach for video indexing and retrieval. The high dimensionality of video sequences still poses a major challenge for video indexing and retrieval. Different from the traditional dimensionality reduction techniques such as Principal Component Analysis (PCA), we embed the video data into a low dimensional statistical manifold obtain...

Nonlinear dimensionality reduction viewed as information retrieval

Nonlinear dimensionality reduction methods are commonly used for two purposes: (i) as preprocessing methods to reduce the number of input variables or to represent the inputs in terms of more natural variables describing the embedded data manifold, or (ii) for making the data set more understandable, by making the similarity relationships between data points explicit through visualizations. The...

Neighborhood Preserving Projections (NPP): A Novel Linear Dimension Reduction Method

Dimension reduction is a crucial step for pattern recognition and information retrieval tasks to overcome the curse of dimensionality. In this paper a novel unsupervised linear dimension reduction method, Neighborhood Preserving Projections (NPP), is proposed. In contrast to traditional linear dimension reduction methods, such as principal component analysis (PCA), the proposed method has good n...

Adaptive nonlinear manifolds and their applications to pattern recognition

Dimensionality reduction has long been associated with retinotopic mapping for understanding cortical maps. Multisensory information is processed, fused and mapped to an essentially 2-D cortex in an information preserving manner. Data processing and projection techniques inspired by this biological mechanism are playing an increasingly important role in pattern recognition, computational intell...

Publication date: 2007