Exemplar-based Robust Coherent Biclustering
Abstract
The biclustering, co-clustering, or subspace clustering problem involves simultaneously grouping the rows and columns of a data matrix to uncover biclusters, that is, sub-matrices of the data matrix that optimize a desired objective function. In coherent biclustering, the objective function contains a coherence measure of the biclusters. We introduce a novel formulation of the coherent biclustering problem and use it to derive two algorithms. The first is based on loopy message passing; the second relies on a greedy strategy and is significantly faster than the first. A distinguishing feature of these algorithms is that they identify an exemplar, or prototypical member, of each bicluster. We note the interference from background elements in biclustering and offer a means to circumvent it using additional regularization. Our experiments with synthetic as well as real-world datasets show that our algorithms are competitive with the current state-of-the-art algorithms for finding coherent biclusters.
Similar resources
GFBA: A Biclustering Algorithm for Discovering Value-Coherent Biclusters
Clustering has been one of the most popular approaches used in gene expression data analysis. A clustering method is typically used to partition genes according to their similarity of expression under different conditions. However, it is often the case that some genes behave similarly only on a subset of conditions and their behavior is uncorrelated over the rest of the conditions. As tradition...
Extending the definition of beta-consistent biclustering for feature selection
Consistent biclusterings of sets of data are useful for solving feature selection and classification problems. The problem of finding a consistent biclustering can be formulated as a combinatorial optimization problem, and it can be solved by the employment of a recently proposed VNS-based heuristic. In this context, the concept of β-consistent biclustering has been introduced for dealing with ...
A Binary Factor Graph Model for Biclustering
Biclustering, which can be defined as the simultaneous clustering of rows and columns in a data matrix, has received increasing attention in recent years, particularly in the field of Bioinformatics (e.g. for the analysis of microarray data). This paper proposes a novel biclustering approach, which extends the Affinity Propagation [1] clustering algorithm to the biclustering case. In particular...
Evolutionary Biclustering of Clickstream Data
Biclustering is a two way clustering approach involving simultaneous clustering along two dimensions of the data matrix. Finding biclusters of web objects (i.e. web users and web pages) is an emerging topic in the context of web usage mining. It overcomes the problem associated with traditional clustering methods by allowing automatic discovery of browsing pattern based on a subset of attribute...
DNA Microarray Data Analysis: A Novel Biclustering Algorithm Approach
Biclustering algorithms refer to a distinct class of clustering algorithms that perform simultaneous row-column clustering. Biclustering problems arise in DNA microarray data analysis, collaborative filtering, market research, information retrieval, text mining, electoral trends, exchange analysis, and so forth. When dealing with DNA microarray experimental data for example, the goal of bicluste...