Discovering significant and interpretable patterns from multifactorial DNA microarray data with poor replication
Abstract
MOTIVATION: Multivariate analyses are advantageous for the simultaneous testing of the separate and combined effects of many variables and of their interactions. In factorial designs with many factors and/or levels, however, sufficient replication is often prohibitively costly. Furthermore, complicated statements are often required for the biological interpretation of the higher-order interactions determined by standard statistical techniques like analysis of variance.
RESULTS: Because we are usually interested in finding factor-specific effects or their interactions, we assumed that the observed expression profile of a gene is a manifestation of an underlying factor-specific generative pattern (FSGP) combined with noise. Thus, a genetic algorithm was created to find the nearest FSGP for each expression profile. We then measured the distance between each profile and the corresponding nearest FSGP. Permutation testing for the distance measures successfully identified those genes with statistically significant profiles, thus yielding straightforward biological interpretations. Association networks of genes, drugs, and cell lines were created as tripartite graphs, representing significant and interpretable relations, by using a microarray experiment of gastric-cancer cell lines with a factorial design and no replication. The proposed method may benefit the combined analysis of heterogeneous expression data from the growing public repositories.
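The nearest-pattern and permutation-testing idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: patterns are modeled as binary "on/off" masks over a drugs-by-cell-lines grid, the distance is 1 minus the Pearson correlation, and the genetic algorithm is replaced by exhaustive enumeration, which is only feasible for small designs. The function names are hypothetical.

```python
import itertools
import numpy as np

def candidate_patterns(n_rows, n_cols):
    """Enumerate assumed factor-specific patterns: binary masks that are 'on'
    for a subset of rows (e.g. drugs) crossed with a subset of columns
    (e.g. cell lines). Constant patterns are skipped."""
    patterns = []
    for r_mask in itertools.product([0, 1], repeat=n_rows):
        for c_mask in itertools.product([0, 1], repeat=n_cols):
            p = np.outer(r_mask, c_mask).astype(float)
            if 0 < p.sum() < p.size:
                patterns.append(p.ravel())
    return np.unique(np.array(patterns), axis=0)

def nearest_pattern_distance(profile, patterns):
    """Return (distance, index) of the closest pattern, with
    distance = 1 - Pearson correlation (an assumed distance measure)."""
    x = np.asarray(profile, dtype=float).ravel()
    x = x - x.mean()
    x_norm = np.linalg.norm(x)
    if x_norm == 0:
        raise ValueError("profile is constant; correlation is undefined")
    p = patterns - patterns.mean(axis=1, keepdims=True)
    corr = (p @ x) / (np.linalg.norm(p, axis=1) * x_norm)
    best = int(np.argmax(corr))
    return 1.0 - float(corr[best]), best

def permutation_p_value(profile, patterns, n_perm=999, seed=0):
    """Shuffle the profile entries, recompute the nearest-pattern distance,
    and report how often a shuffled profile fits a pattern at least as well."""
    rng = np.random.default_rng(seed)
    flat = np.asarray(profile, dtype=float).ravel()
    observed, best = nearest_pattern_distance(flat, patterns)
    hits = 0
    for _ in range(n_perm):
        d, _ = nearest_pattern_distance(rng.permutation(flat), patterns)
        hits += int(d <= observed)
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, best, (hits + 1) / (n_perm + 1)

# Usage: a simulated gene that responds to drug 0 in every cell line
# (4 drugs x 5 cell lines, no replication), plus Gaussian noise.
rng = np.random.default_rng(1)
profile = np.zeros((4, 5))
profile[0, :] = 1.0
profile += rng.normal(0.0, 0.1, size=profile.shape)

patterns = candidate_patterns(4, 5)
distance, index, p_value = permutation_p_value(profile, patterns)
print(patterns[index].reshape(4, 5), distance, p_value)
```

The permutation test is informative only because the assumed pattern set is structured: if every possible binary mask were allowed, shuffling the entries would never change the nearest-pattern distance.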
Related articles
Analysis of Microarray Gene Expression Data Using Machine Learning Techniques
The advent of DNA microarrays has facilitated a fundamental transition from gene science to genome science. By performing massively parallel experiments on thousands of genes at once, scientists have, for the first time, the capability of observing the complex relationships between genes under controlled experimental conditions. However, the immense volume of data being generated by microarray ...
A memetic algorithm for discovering negative correlation biclusters of DNA microarray data
Most biclustering algorithms for microarrays data analysis focus on positive correlations of genes. However, recent studies demonstrate that groups of biologically significant genes can show negative correlations as well. So, discovering negatively correlated patterns from microarrays data represents a real need. In this paper, we propose a Memetic Biclustering Algorithm (MBA) which is able to ...
Methods for assessing reproducibility of clustering patterns observed in analyses of microarray data
MOTIVATION Recent technological advances such as cDNA microarray technology have made it possible to simultaneously interrogate thousands of genes in a biological specimen. A cDNA microarray experiment produces a gene expression 'profile'. Often interest lies in discovering novel subgroupings, or 'clusters', of specimens based on their profiles, for example identification of new tumor taxonomie...
Applying Biclustering to understand the molecular basis of phenotypic diversity
High-throughput techniques, such as DNA microarrays, that are used in gene expression measurements offer a unique and global insight into the molecular mechanisms of a living cell. Computational resources are fundamental in order to extract biological interpretable information and deal with the big amount of the data extracted from these techniques. Statistical analysis of microarray data is a ...
Global effects of DNA replication and DNA replication origin activity on eukaryotic gene expression
This report provides a global view of how gene expression is affected by DNA replication. We analyzed synchronized cultures of Saccharomyces cerevisiae under conditions that prevent DNA replication initiation without delaying cell cycle progression. We use a higher-order singular value decomposition to integrate the global mRNA expression measured in the multiple time courses, detect and remove...
Journal title:
- Journal of biomedical informatics
Volume 37, Issue 4
Pages: -
Publication date: 2004