Constrained Subspace Clustering for Time Series Gene Expression Data

Authors

  • Jibin Qu
  • Michael Ng
  • Luonan Chen
Abstract

For time series gene expression data, an important problem is finding subgroups of genes with similar expression patterns within a consecutive time window. In this paper, we extend a fuzzy c-means clustering algorithm to construct two models, one for each of two types of bicluster: constant-value biclusters and similarity-based biclusters whose gene expression profiles are similar over consecutive time points. Finally, we verify our methods on several artificial datasets.
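For orientation, the following is a minimal sketch of the standard fuzzy c-means algorithm that the abstract takes as its starting point. It is not the constrained biclustering models of the paper, whose details are not given here. The function name `fuzzy_c_means`, its parameters, and the toy `profiles` data are assumptions made for illustration only.

```python
import random


def fuzzy_c_means(data, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means on a list of equal-length numeric rows.

    Hypothetical helper for illustration; m > 1 is the fuzzifier.
    Returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the rows weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * data[i][j] for i in range(n)) / total
                 for j in range(dim)]
            )

        # Membership update from squared Euclidean distances.
        max_change = 0.0
        for i in range(n):
            d2 = [sum((data[i][j] - c[j]) ** 2 for j in range(dim))
                  for c in centers]
            zeros = [k for k, d in enumerate(d2) if d == 0.0]
            if zeros:
                # The row coincides with one or more centers.
                new_row = [1.0 / len(zeros) if k in zeros else 0.0
                           for k in range(n_clusters)]
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                total = sum(inv)
                new_row = [v / total for v in inv]
            change = max(abs(a - b) for a, b in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row

        if max_change < tol:
            break

    return centers, u


# Toy data (made up): six "genes" measured at four time points,
# forming a low-expression group and a high-expression group.
profiles = [
    [0.0, 0.1, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.1],
    [5.0, 5.1, 5.0, 5.1],
    [5.1, 5.0, 5.1, 5.0],
    [5.0, 5.0, 5.1, 5.1],
]

centers, memberships = fuzzy_c_means(profiles, n_clusters=2)
# Hard label per gene: the cluster with the largest membership.
labels = [row.index(max(row)) for row in memberships]
```

This sketch clusters whole expression profiles across all time points. Restricting the comparison to a consecutive time window, as the abstract describes, would require additional constraints that are not reproduced here.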

Similar resources

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into existing clustering algorithms. One of the well-researched constrained clustering algorithms is called microaggregation. In a microaggreg...

Subspace Clustering of Microarray Data based on Domain Transformation

We propose a mining framework that supports the identification of useful knowledge based on data clustering. With the recent advancement of microarray technologies, we focus our attention on gene expression datasets mining. In particular, given that genes are often coexpressed under subsets of experimental conditions, we present a novel algorithm on subspace clustering. In contrast to previous ...

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in the prevention, diagnosis, and treatment of diseases at the genomic level. In this paper, fast global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement on the k-means clustering method. It is an incremental clustering method that starts with one cluster. Iteratively ...

Subspace Clustering of Microarray Data Based on Domain Transformation

We propose a mining framework that supports the identification of useful knowledge based on data clustering. With the recent advancement of microarray technologies, we focus our attention on gene expression datasets mining. In particular, given that genes are often coexpressed under subsets of experimental conditions, we present a novel subspace clustering algorithm. In contrast to previous app...

Topographic Independent Component Analysis of Gene Expression Time Series Data

Topographic independent component analysis (TICA) is an interesting extension of conventional ICA that aims to find a linear decomposition into approximately independent components, in which the dependence between two components is approximated by their proximity in the topographic representation. In this paper we apply topographic ICA to gene expression time series data and compare it ...

Publication date: 2010