Simplex Decompositions using SVD and PLSA

Authors

  • Madhusudana V. S. Shashanka
  • Michael Giering
Abstract

Probabilistic Latent Semantic Analysis (PLSA) is a popular technique to analyze non-negative data where multinomial distributions underlying every data vector are expressed as linear combinations of a set of basis distributions. These learned basis distributions that characterize the dataset lie on the standard simplex and themselves represent corners of a simplex within which all data approximations lie. In this paper, we describe a novel method to extend the PLSA decomposition where the bases are not constrained to lie on the standard simplex and thus are better able to characterize the data. The locations of PLSA basis distributions on the standard simplex depend on how the dataset is aligned with respect to the standard simplex. If the directions of maximum variance of the dataset are orthogonal to the standard simplex, then the PLSA bases will give a poor representation of the dataset. Our approach overcomes this drawback by utilizing Singular Value Decomposition (SVD) to identify the directions of maximum variance, and transforming the dataset to align these directions parallel to the standard simplex before performing PLSA. The learned PLSA features are then transformed back into the data space. The effectiveness of the proposed approach is demonstrated with experiments on synthetic data.
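As a rough illustration of the pipeline described above, the Python sketch below (my own, not the authors' code) fits a plain asymmetric PLSA model by EM and uses an SVD of the mean-centered data to expose the directions of maximum variance; the paper's step of re-aligning those directions with the standard simplex before running PLSA is only summarized in the abstract, so it is noted in a comment rather than implemented.

# Minimal sketch, assuming a non-negative data matrix V (features x observations).
# Not the authors' code; the simplex re-alignment step from the paper is omitted.
import numpy as np

def plsa(V, n_bases, n_iter=200, seed=0):
    """Asymmetric PLSA by EM: returns W (columns are basis distributions P(f|z))
    and H (columns are mixture weights P(z|n)), so each column of W @ H
    approximates the corresponding normalized column of V."""
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, n_bases)); W /= W.sum(axis=0, keepdims=True)
    H = rng.random((n_bases, N)); H /= H.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        R = W @ H + 1e-12                      # current model P(f|n)
        ratio = V / R                          # E-step enters through this ratio
        W_new = W * (ratio @ H.T)              # M-step numerator for P(f|z)
        H_new = H * (W.T @ ratio)              # M-step numerator for P(z|n)
        W = W_new / W_new.sum(axis=0, keepdims=True)
        H = H_new / H_new.sum(axis=0, keepdims=True)
    return W, H

def principal_directions(V, k):
    """Directions of maximum variance via SVD of the mean-centered data.
    In the paper these guide a transform that aligns the data with the
    standard simplex before PLSA; that transform is not reproduced here."""
    Vc = V - V.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(Vc, full_matrices=False)
    return U[:, :k], s[:k]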


Similar Articles

RPLSA: A novel updating scheme for Probabilistic Latent Semantic Analysis

A novel updating method for Probabilistic Latent Semantic Analysis (PLSA), called Recursive PLSA (RPLSA), is proposed. The updating of conditional probabilities is derived from first principles for both the asymmetric and the symmetric PLSA formulations. The performance of RPLSA for both formulations is compared to that of the PLSA folding-in, the PLSA rerun from the breakpoint, and well-known ...
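The RPLSA recursions themselves are not given in this excerpt; for orientation, here is a hedged Python sketch of the standard PLSA folding-in baseline that the excerpt compares against, in which the basis distributions P(w|z) stay fixed and only the mixture weights of a new document are re-estimated by EM.

# Sketch of plain PLSA folding-in (a baseline named in the excerpt), not RPLSA.
import numpy as np

def fold_in(W, v_new, n_iter=100, seed=0):
    """W[:, z] = P(w|z), held fixed; v_new = word counts of a new document.
    Returns h with h[z] = P(z|d_new)."""
    rng = np.random.default_rng(seed)
    h = rng.random(W.shape[1]); h /= h.sum()
    for _ in range(n_iter):
        r = W @ h + 1e-12                # predicted word distribution of the new document
        h = h * (W.T @ (v_new / r))      # EM update with the bases held fixed
        h /= h.sum()
    return h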


Multi-Level Cluster Indicator Decompositions of Matrices and Tensors

A central challenge for many machine learning and data mining applications is that the number of data points and features is very large, so that low-rank approximations of the original data are often required for efficient computation. We propose new multi-level clustering based low-rank matrix approximations which are comparable and even more compact than Singular Value Decomposition (SVD). We ut...
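The multi-level construction is not described in this excerpt; as a hedged one-level stand-in, the sketch below clusters the rows of a matrix and stores the approximation as a binary cluster-indicator matrix plus k centroid rows, the kind of compact cluster-indicator factorization the excerpt alludes to (function and parameter names are illustrative only).

# One-level cluster-indicator approximation sketch, not the paper's algorithm:
# X is approximated by F @ C, where F is a binary row-cluster indicator matrix
# and C holds the k cluster centroids.
import numpy as np
from sklearn.cluster import KMeans

def cluster_indicator_approx(X, k, random_state=0):
    labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(X)
    F = np.zeros((X.shape[0], k))
    F[np.arange(X.shape[0]), labels] = 1.0        # indicator of each row's cluster
    C = (F.T @ X) / F.sum(axis=0)[:, None]        # centroids = cluster means
    return F, C                                   # storage: indicator + k centroid rows, vs. SVD's dense factors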


A Comparative performance evaluation of SVD and Schur Decompositions for Image Watermarking

In this paper, the performance of SVD and Schur decompositions is evaluated and compared for image copyright protection applications. The watermark image is embedded in the cover image by using Quantization Index Modulus Modulation (QIMM) and Quantization Index Modulation (QIM). The watermark image is embedded in the D matrix of the Schur decomposition and the Singular Value Decomposition (SVD). Watermarkin...
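The QIMM details are not in the excerpt; the following sketch shows only plain QIM applied to the singular values of a cover image, with an illustrative step size, to make the embedding idea concrete (not the paper's exact scheme).

# Plain QIM on singular values, a sketch only; `delta` is an illustrative step size.
# Caveat: the quantized values must stay in descending order, or the bit positions
# get permuted when the marked image is decomposed again for extraction.
import numpy as np

def embed_qim_svd(cover, bits, delta=10.0):
    U, s, Vt = np.linalg.svd(cover, full_matrices=False)
    s = s.copy()
    for i, b in enumerate(bits):
        # quantize the i-th singular value to the lattice associated with bit b
        s[i] = delta * np.round((s[i] - b * delta / 2) / delta) + b * delta / 2
    return U @ np.diag(s) @ Vt

def extract_qim_svd(marked, n_bits, delta=10.0):
    s = np.linalg.svd(marked, compute_uv=False)
    bits = []
    for x in s[:n_bits]:
        d0 = abs(x - delta * np.round(x / delta))                              # distance to bit-0 lattice
        d1 = abs(x - (delta * np.round((x - delta / 2) / delta) + delta / 2))  # distance to bit-1 lattice
        bits.append(0 if d0 <= d1 else 1)
    return bits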


Estimating a Few Extreme Singular Values and Vectors for Large-Scale Matrices in Tensor Train Format

We propose new algorithms for singular value decomposition (SVD) of very large-scale matrices based on a low-rank tensor approximation technique called the tensor train (TT) format. The proposed algorithms can compute several dominant singular values and corresponding singular vectors for large-scale structured matrices given in a TT format. The computational complexity of the proposed methods ...
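The tensor-train algorithms cannot be reconstructed from this excerpt; as a stand-in for the same end goal, the sketch below computes a few dominant singular triplets of a large matrix that is available only through matrix-vector products, using a scipy LinearOperator rather than any TT machinery (the diagonal operator is a made-up example).

# Sketch: a few dominant singular triplets of an implicitly defined matrix,
# using svds on a LinearOperator instead of ever forming the matrix densely.
import numpy as np
from scipy.sparse.linalg import LinearOperator, svds

n = 100_000
d = np.linspace(1.0, 2.0, n)                     # made-up "structured" matrix: diag(d)
A = LinearOperator((n, n),
                   matvec=lambda x: d * x.ravel(),
                   rmatvec=lambda x: d * x.ravel())

U, s, Vt = svds(A, k=5)                          # five dominant singular triplets
print(np.sort(s)[::-1])                          # approximately the five largest entries of d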


Developing Tensor Operations with an Underlying Group Structure

Tensor computations frequently involve factoring or decomposing a tensor into a sum of rank-1 tensors (CANDECOMP-PARAFAC, HOSVD, etc.). These decompositions are often considered as different higher-order extensions of the matrix SVD. The HOSVD can be described using the n-mode product, which describes multiplication between a higher-order tensor and a matrix. Generalizing this multiplication le...
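The n-mode product is standard, so a small numpy sketch can make it concrete; the function name and zero-indexed mode convention below are illustrative, not taken from the paper.

# n-mode product T x_n M: apply the matrix M to every mode-n fiber of T,
# so mode n of size T.shape[n] becomes size M.shape[0].
import numpy as np

def mode_n_product(T, M, n):
    Tn = np.moveaxis(T, n, 0).reshape(T.shape[n], -1)    # mode-n unfolding
    out = M @ Tn                                         # multiply along mode n
    new_shape = (M.shape[0],) + tuple(np.delete(T.shape, n))
    return np.moveaxis(out.reshape(new_shape), 0, n)

# Example: a 3 x 4 x 5 tensor times a 2 x 4 matrix along mode 1.
T = np.random.rand(3, 4, 5)
M = np.random.rand(2, 4)
print(mode_n_product(T, M, 1).shape)                     # (3, 2, 5)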



Journal:

Volume:   Issue:

Pages:  -

Publication date: 2012