Discovering Biological Progression Underlying Microarray Samples
Abstract
In biological systems that undergo processes such as differentiation, a clear concept of progression exists. We present a novel computational approach, called Sample Progression Discovery (SPD), to discover patterns of biological progression underlying microarray gene expression data. SPD assumes that individual samples of a microarray dataset are related by an unknown biological process (e.g., differentiation, development, cell cycle, disease progression), and that each sample represents one unknown point along the progression of that process. SPD aims to organize the samples in a manner that reveals the underlying progression and to simultaneously identify subsets of genes that are responsible for that progression. We demonstrate the performance of SPD on a variety of microarray datasets that were generated by sampling a biological process at different points along its progression, without providing SPD any information about the underlying process. When applied to a cell cycle time series microarray dataset, SPD was not provided any prior knowledge of the samples' time order or of which genes are cell-cycle regulated, yet SPD recovered the correct time order and identified many genes that have been associated with the cell cycle. When applied to B-cell differentiation data, SPD recovered the correct order of stages of normal B-cell differentiation and the linkage between preB-ALL tumor cells and their cell of origin, preB. When applied to mouse embryonic stem cell (ESC) differentiation data, SPD uncovered a landscape of ESC differentiation into various lineages and genes that represent both generic and lineage-specific processes. When applied to a prostate cancer microarray dataset, SPD identified gene modules that reflect a progression consistent with disease stages.
SPD may be best viewed as a novel tool for synthesizing biological hypotheses because it provides a likely biological progression underlying a microarray dataset and, perhaps more importantly, the candidate genes that regulate that progression.
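The idea of recovering a progression from unordered samples can be illustrated with a small sketch. The abstract does not specify SPD's algorithm, so the code below is not SPD itself. It is a simplified stand-in that assumes samples lying close together in expression space are adjacent in the progression: it builds a minimum spanning tree over the samples and reads an ordering off the longest path in that tree. The function names (`minimum_spanning_tree`, `progression_order`) and the toy data are hypothetical, and the sketch ignores the gene-selection half of the problem entirely.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(samples):
    """Prim's algorithm on the complete graph of samples.

    Returns an adjacency dict {i: [(j, weight), ...]}.
    """
    n = len(samples)
    adjacency = {i: [] for i in range(n)}
    if n == 0:
        return adjacency
    # best[j] = (distance, tree_node): cheapest known edge joining j to the tree
    best = {j: (euclidean(samples[0], samples[j]), 0) for j in range(1, n)}
    while best:
        j = min(best, key=lambda k: best[k][0])
        weight, i = best.pop(j)
        adjacency[i].append((j, weight))
        adjacency[j].append((i, weight))
        # Sample j is now in the tree; see if it offers cheaper edges.
        for k in best:
            d = euclidean(samples[j], samples[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return adjacency


def farthest_path(adjacency, start):
    """Path from start to the node farthest from it along tree edges."""
    stack = [(start, None, 0.0, [start])]
    best_dist, best_path = -1.0, [start]
    while stack:
        node, parent, dist, path = stack.pop()
        if dist > best_dist:
            best_dist, best_path = dist, path
        for nxt, weight in adjacency[node]:
            if nxt != parent:
                stack.append((nxt, node, dist + weight, path + [nxt]))
    return best_path


def progression_order(samples):
    """Order sample indices along the longest path of the spanning tree.

    The direction is arbitrary: the data alone cannot say which end is
    the start of the progression.
    """
    if not samples:
        return []
    adjacency = minimum_spanning_tree(samples)
    one_end = farthest_path(adjacency, 0)[-1]
    return farthest_path(adjacency, one_end)


# Toy data (made up): two "genes" that rise with a hidden time point.
# The samples are given in shuffled order.
hidden_time = [3, 0, 5, 1, 4, 2]
samples = [[float(t), 2.0 * t] for t in hidden_time]
order = progression_order(samples)
recovered = [hidden_time[i] for i in order]
print(order, recovered)
```

On this toy input the recovered time points come out monotone, either ascending or descending. Real data would branch, contain noise, and include many genes unrelated to the progression, which is why a method like SPD also has to decide which genes to trust.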
Similar resources
Discovering biological processes from microarray data using independent component analysis
We propose a hypothesis-free methodology for discovering genome-wide expression patterns specific to underlying biological processes from DNA microarray expression data. We apply linear and nonlinear independent component analysis (ICA) as a tool for decomposing microarray data into statistically independent components. Each component represents a gene expression pattern of a putative underlyin...
Unsupervised Dense Regions Discovery in DNA Microarray Data
In this paper, we introduce the notion of dense regions in DNA microarray data and present algorithms for discovering them. We demonstrate that dense regions are of statistical and biological significance through experiments. A dataset containing gene expression levels of 23 primate brain samples is employed to test our algorithms. Subsets of potential genes distinguishing between species and a...
Pathogenesis of Epilepsy: Challenges in Animal Models
Epilepsy is one of the most common chronic disorders affecting individuals of all ages. A greater understanding of the pathogenesis of epilepsy will likely provide the fundamental basis for the development of new antiepileptic therapies that aim to prevent epileptogenesis or modify the progression of epilepsy, in addition to treating its symptoms. Therefore, severa...
Sparse Representation Models and Applications to Bioinformatics, by Roger Pique-Regi
Microarrays and new sequencing techniques offer a high-throughput platform to study the whole genome, with the unprecedented capability of measuring millions of genomic features in a single assay. This massively parallel measurement power has enormous potential for research in biology and medicine, with the ultimate objective of identifying and learning the biological processes occurring in diff...
A Novel Approach for Discovering Condition-Specific Correlations of Gene Expressions within Biological Pathways by Using Cloud Computing Technology
Microarrays are widely used to assess gene expressions. Most microarray studies focus primarily on identifying differential gene expressions between conditions (e.g., cancer versus normal cells), for discovering the major factors that cause diseases. Because previous studies have not identified the correlations of differential gene expression between conditions, crucial but abnormal regulations...
Volume: 7
Publication date: 2011