Clustering Microarray Data
Abstract
We begin with an example that will be used throughout the chapter. The data come from Sorlie et al. (2001). The goal of that article was to “classify breast carcinomas based on variations in gene expression derived from complementary deoxyribonucleic acid (cDNA) microarrays and to correlate tumor characteristics to clinical outcome.” The data consist of log fluorescence values for 456 cDNA clones measured on 85 tissue samples. Of the 85 samples, 4 are normal tissue samples, 78 are carcinomas, and 3 are fibroadenomas. Three of the four normal tissue samples were pooled normal breast samples from multiple individuals. Sorlie et al. (2001) selected the 456 genes from an initial set of 8102 genes so as to optimally identify the intrinsic characteristics of breast tumors. In Figures 4.1 and 4.2, the data are plotted as heat maps. This representation assigns a color to every matrix entry, with negative (underexpressed) values shown in green and positive (overexpressed) values in red. The data presented in these plots were preprocessed by Sorlie et al. (2001), who adjusted the rows and columns to have median zero. This preprocessing was applied before the subset of 456 genes was selected, so the column (i.e., sample) medians of the subset are not precisely zero.
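The remark about column medians can be illustrated with a small sketch. It uses synthetic random numbers in place of the Sorlie et al. (2001) measurements. The helper `median_center` is hypothetical, and its alternating row and column sweeps are an assumption, since the abstract does not say exactly how the adjustment was carried out. The sketch centers a full 8102 × 85 matrix, keeps 456 rows, and compares the column medians before and after the selection.

```python
import numpy as np

def median_center(x, n_iter=5):
    """Hypothetical helper: alternately subtract row and column medians.

    The alternating scheme is an assumption; the source only says that
    rows and columns were adjusted to have median zero.
    """
    x = np.asarray(x, dtype=float).copy()
    for _ in range(n_iter):
        x -= np.median(x, axis=1, keepdims=True)  # center each row (gene)
        x -= np.median(x, axis=0, keepdims=True)  # center each column (sample)
    return x

rng = np.random.default_rng(0)

# Synthetic stand-in for the full data: 8102 genes by 85 samples.
full = rng.normal(size=(8102, 85))
centered = median_center(full)

# Keep 456 rows. Here they are picked at random; the original selection
# was based on biological criteria, not chance.
keep = rng.choice(centered.shape[0], size=456, replace=False)
subset = centered[keep]

col_medians_full = np.median(centered, axis=0)
col_medians_subset = np.median(subset, axis=0)

print("largest |column median|, full matrix:", np.abs(col_medians_full).max())
print("largest |column median|, 456-row subset:", np.abs(col_medians_subset).max())
```

Because the column sweep runs last, the column medians of the full matrix are zero up to rounding error. Dropping rows changes which values sit in the middle of each column, so the medians of the subset drift away from zero.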
Similar Resources
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Application of Clustering Methods in DNA Microarrays
Background: DNA microarray technology has paved the way for investigators to measure the expression of thousands of genes in a short time. Analysis of this large amount of raw data includes normalization, clustering, and classification. The present study surveys the application of clustering techniques in DNA microarray analysis. Materials and methods: We analyzed data from the study by Van’t Veer et al. dealing with BRCA1...
Application of Biclustering with the "Large Average Submatrices" Method to Gene Expression Data from DNA Microarrays
Background and Objective: In recent years, DNA microarray technology has become a central tool in genomic research. This technology makes it possible to analyze expression levels for thousands of genes simultaneously under different conditions, yielding massive amounts of information. While traditional clustering methods, such as hierarchical and K-means clustering have been...
Data Complexity in Clustering Analysis of Gene Microarray Expression Profiles
The increasing application of microarray technology is generating large amounts of high dimensional gene expression data. Genes participating in the same biological process tend to have similar expression patterns, and clustering is one of the most useful and efficient methods for identifying these patterns. Due to the complexity of microarray profiles, there are some limitations in directly ap...
Fuzzy Types Clustering for Microarray Data
The main goal of microarray experiments is to quantify the expression of every object on a slide as precisely as possible, with a further goal of clustering the objects. Recently, many studies have discussed clustering issues involving similar patterns of gene expression. This paper presents an application of fuzzy-type methods for clustering DNA microarray data that can be applied to typical c...
IEEE Paper Template in A4 (V1)
In data mining, clustering techniques have been applied in cellular processes, gene regulation, sub types of cells and gene function. Clustering in microarray gene expression handles various experimental conditions in various algorithms by using different data sets. This paper focuses on the clustering of gene expression data using data sets such as yeast data, yeast cell-cycle, se...