Fast Nonnegative Matrix Tri-Factorization for Large-Scale Data Co-Clustering
Authors
Abstract
Nonnegative Matrix Factorization (NMF) based co-clustering methods have attracted increasing attention in recent years because of their mathematical elegance and encouraging empirical results. However, the algorithms used to solve NMF problems usually involve intensive matrix multiplications, which makes them computationally inefficient. In this paper, instead of constraining the factor matrices of NMF to be nonnegative as existing methods do, we propose a novel Fast Nonnegative Matrix Tri-factorization (FNMTF) approach that constrains them to be cluster indicator matrices, a special type of nonnegative matrix. As a result, the optimization problem of our approach can be decoupled into much smaller subproblems that require far fewer matrix multiplications, so that our approach works well for large-scale input data. Moreover, the resulting factor matrices directly assign cluster labels to data points and features, owing to the nature of indicator matrices. In addition, by exploiting the manifold structures in both the data and feature spaces, we further introduce the Locality Preserved FNMTF (LP-FNMTF) approach, which improves the clustering performance. The promising results of extensive experimental evaluations validate the effectiveness of the proposed methods.
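To make the decoupling idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the objective is the squared Frobenius error ||X - F S G^T||^2 with F and G restricted to cluster indicator matrices (exactly one 1 per row); the abstract does not state the objective or the update rules, so the function name `fnmtf_sketch`, the helper `indicator`, the alternating update order, and the handling of empty clusters are all illustrative assumptions. Under that assumption, F^T F and G^T G are diagonal, so S reduces to block means, and each row of F or G can be chosen independently by a nearest-prototype search.

```python
import numpy as np

def indicator(labels, k):
    """One-hot cluster indicator matrix: row i has a single 1 in column labels[i]."""
    M = np.zeros((labels.size, k))
    M[np.arange(labels.size), labels] = 1.0
    return M

def fnmtf_sketch(X, k_rows, k_cols, n_iter=20, seed=0):
    """Alternating updates for X ~ F S G^T with indicator F, G (assumed objective)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    F = indicator(rng.integers(k_rows, size=n), k_rows)  # random row clusters
    G = indicator(rng.integers(k_cols, size=d), k_cols)  # random column clusters
    history = []
    for _ in range(n_iter):
        # S update: F^T F and G^T G are diagonal (cluster sizes), so the
        # least-squares S is the mean of each co-cluster block.
        # Sizes are clipped to 1 to avoid dividing by zero for empty clusters.
        row_sizes = np.maximum(F.sum(axis=0), 1.0)
        col_sizes = np.maximum(G.sum(axis=0), 1.0)
        S = (F.T @ X @ G) / np.outer(row_sizes, col_sizes)

        # G update: each column of X independently picks the closest column of F S.
        # Broadcasting is used for clarity; it is not memory-efficient.
        FS = F @ S                                                   # n x k_cols
        dist = ((X[:, :, None] - FS[:, None, :]) ** 2).sum(axis=0)   # d x k_cols
        G = indicator(dist.argmin(axis=1), k_cols)

        # F update: each row of X independently picks the closest row of S G^T.
        SG = S @ G.T                                                 # k_rows x d
        dist = ((X[:, None, :] - SG[None, :, :]) ** 2).sum(axis=2)   # n x k_rows
        F = indicator(dist.argmin(axis=1), k_rows)

        history.append(np.linalg.norm(X - F @ S @ G.T) ** 2)
    return F, S, G, history

if __name__ == "__main__":
    X = np.random.default_rng(1).random((12, 8))
    F, S, G, history = fnmtf_sketch(X, k_rows=3, k_cols=2)
    print("row labels:", F.argmax(axis=1))
    print("column labels:", G.argmax(axis=1))
```

Because every row of F and G contains a single 1, the cluster labels can be read off with `argmax`, which mirrors the abstract's remark that the factor matrices assign labels directly. The sketch does not cover the locality-preserving variant (LP-FNMTF).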
Similar resources
Fast Robust Non-Negative Matrix Factorization for Large-Scale Human Action Data Clustering
Human action recognition is important in improving human life in various aspects. However, outliers and noise in the data often hamper clustering tasks. Therefore, there is a great need for robust data clustering techniques. Nonnegative matrix factorization (NMF) and Nonnegative Matrix Tri-Factorization (NMTF) methods have been widely researched in recent years and applied to many data clus...
A Projected Alternating Least square Approach for Computation of Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) is a common method in data mining that has been used in different applications as a dimension reduction, classification or clustering method. Alternating least squares (ALS) methods are usually used to solve this non-convex minimization problem. At each step of an ALS algorithm, two convex least squares problems must be solved, which causes high com...
Orthogonal Nonnegative Matrix Factorization for Multi-type Relational Clustering
Relational clustering with heterogeneous data objects has an impact on various important applications, such as web mining, text mining and bioinformatics. In this paper, we build a star-structured general model for relational clustering. It is formulated as an orthogonal tri-nonnegative matrix factorization. The model performs matrix approximation among all different data types to look for hid...
On Trivial Solution and Scale Transfer Problems in Graph Regularized NMF
Combining graph regularization with nonnegative matrix (tri-)factorization (NMF) has shown great performance improvement compared with traditional nonnegative matrix (tri-)factorization models due to its ability to utilize the geometric structure of the documents and words. In this paper, we show that these models are not well-defined and suffer from trivial solution and scale transfer problem...
Orthogonal Nonnegative Matrix Tri-factorization for Semi-supervised Document Co-clustering
Semi-supervised clustering is often viewed as using labeled data to aid the clustering process. However, existing algorithms fail to consider dual constraints between data points (e.g. documents) and features (e.g. words). To address this problem, in this paper, we propose a novel semi-supervised document co-clustering model OSS-NMF via orthogonal nonnegative matrix tri-factorization. Our model...