Gaining confidence in biological interpretation of the microarray data: the functional consistence of the significant GO categories

نویسندگان

  • Da Yang
  • Yanhui Li
  • Hui Xiao
  • Qing Liu
  • Min Zhang
  • Jing Zhu
  • Wencai Ma
  • Chen Yao
  • Jing Wang
  • Dong Wang
  • Zheng Guo
  • Baofeng Yang
چکیده

MOTIVATION In microarray studies, numerous tools are available for functional enrichment analysis based on GO categories. Most of these tools, due to their requirement of a prior threshold for designating genes as differentially expressed genes (DEGs), are categorized as threshold-dependent methods that often suffer from a major criticism on their changing results with different thresholds. RESULTS In the present article, by considering the inherent correlation structure of the GO categories, a continuous measure based on semantic similarity of GO categories is proposed to investigate the functional consistence (or stability) of threshold-dependent methods. The results from several datasets show when simply counting overlapping categories between two groups, the significant category groups selected under different DEG thresholds are seemingly very different. However, based on the semantic similarity measure proposed in this article, the results are rather functionally consistent for a wide range of DEG thresholds. Moreover, we find that the functional consistence of gene lists ranked by SAM metric behaves relatively robust against changing DEG thresholds. AVAILABILITY Source code in R is available on request from the authors.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using the Protein-protein Interaction Network to Identifying the Biomarkers in Evolution of the Oocyte

Background Oocyte maturity includes nuclear and cytoplasmic maturity, both of which are important for embryo fertilization. The development of oocyte is not limited to the period of follicular growth, and starts from the embryonic period and continues throughout life. In this study, for the purpose of evaluating the effect of the FSH hormone on the expression of genes, GEO access codes for this...

متن کامل

CLENCH: a program for calculating Cluster ENriCHment using the Gene Ontology

SUMMARY Analysis of microarray data most often produces lists of genes with similar expression patterns, which are then subdivided into functional categories for biological interpretation. Such functional categorization is most commonly accomplished using Gene Ontology (GO) categories. Although there are several programs that identify and analyze functional categories for human, mouse and yeast...

متن کامل

Identification of Alzheimer disease-relevant genes using a novel hybrid method

Identifying genes underlying complex diseases/traits that generally involve multiple etiological mechanisms and contributing genes is difficult. Although microarray technology has enabled researchers to investigate gene expression changes, but identifying pathobiologically relevant genes remains a challenge. To address this challenge, we apply a new method for selecting the disease-relevant gen...

متن کامل

Extracellular exosomes and preeclampsia: a microarray-based study and functional enrichment analysis

Background:  Preeclampsia (PE) is a heterogeneous pregnancy disease which the exact pathophysiology of it is unknown. Recently exosomes have been indicated as a causative factor in the pathogenesis of PE. The aim of the study was to investigate in microarray library data to extract the differentially expressed genes (DEGs) in PE and to perform a functional enrichment analysis to predict the rol...

متن کامل

The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data

Background and Objectives: In recent years, new technologies have led to produce a large amount of data and in the field of biology, microarray technology has also dramatically developed. Meanwhile, the Fisher test is used to compare the control group with two or more experimental groups and also to detect the differentially expressed genes. In this study, the false discovery rate was investiga...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Bioinformatics

دوره 24 2  شماره 

صفحات  -

تاریخ انتشار 2008