Gene Ontology Annotation Using Word Proximity Relationship

Authors

  • Kevin Hsin-Yih Lin
  • Wen-Juan Hou
  • Hsin-Hsi Chen
Abstract

In this paper, we propose an approach to Gene Ontology (GO) annotation of full-text biomedical articles. The system exploits the word proximity relationship between genes and GO terms. We associate genes with GO terms by evaluating a density function over gene-GO pairs within a paragraph. Several density models are built, and several evaluation criteria are used to assess the effects of the proposed methods. In the best case, we obtained a precision of < 88% and a recall of < 12%.
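
The abstract does not spell out the density models, so the following is only a minimal sketch of the general idea of scoring a gene-GO pair by word proximity within a paragraph. The tokenizer, the function names `positions` and `proximity_score`, and the inverse-distance weight 1 / (1 + distance) are all illustrative assumptions, not the authors' method.

```python
import re
from itertools import product


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (hyphens kept)."""
    return re.findall(r"[a-z0-9\-]+", text.lower())


def positions(tokens, term):
    """Return each start index where the (possibly multi-word) term occurs."""
    words = tokenize(term)
    n = len(words)
    if n == 0:
        return []
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]


def proximity_score(paragraph, gene, go_term):
    """Sum an inverse-distance weight over all gene / GO-term occurrence pairs.

    Assumption: the weight 1 / (1 + distance) is a stand-in kernel chosen for
    illustration; the paper's actual density functions are not given here.
    """
    tokens = tokenize(paragraph)
    gene_pos = positions(tokens, gene)
    go_pos = positions(tokens, go_term)
    return sum(1.0 / (1 + abs(g - t)) for g, t in product(gene_pos, go_pos))


paragraph = "BRCA1 regulates DNA repair in cells. BRCA1 also binds RAD51."
score = proximity_score(paragraph, "BRCA1", "DNA repair")
```

Under this sketch, a gene mentioned close to a GO term contributes more to the pair's score than a distant mention, and a pair that never co-occurs in the paragraph scores zero. A threshold on the score would then decide whether to propose the annotation.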

Similar resources

A Random Forest proximity matrix as a new measure for gene annotation

In this paper we present a new score for gene annotation. This new score is based on the proximity matrix obtained from a trained Random Forest (RF) model. As an example application, we built this model using the association p-values of genotype with blood phenotype as input and the association of genotype data with coronary heart disease as output. This new score has been validated by comparing...

Gene ontology annotation by density and gravitation models.

Gene Ontology (GO) is developed to provide standard vocabularies of gene products in different databases. The process of annotating GO terms to genes requires curators to read through lengthy articles. Methods for speeding up or automating the annotation process are thus of great importance. We propose a GO annotation approach using full-text biomedical documents for directing more relevant pap...

Defining functional distance using manifold embeddings of gene ontology annotations.

Although rigorous measures of similarity for sequence and structure are now well established, the problem of defining functional relationships has been particularly daunting. Here, we present several manifold embedding techniques to compute distances between Gene Ontology (GO) functional annotations and consequently estimate functional distances between protein domains. To evaluate accuracy, we...

Combining Evidence, Specificity, and Proximity towards the Normalization of Gene Ontology Terms in Text

Structured information provided by manual annotation of proteins with Gene Ontology concepts represents a high-quality reliable data source for the research community. However, a limited scope of proteins is annotated due to the amount of human resources required to fully annotate each individual gene product from the literature. We introduce a novel method for automatic identification of GO te...

Identifying informative subsets of the Gene Ontology with information bottleneck methods

MOTIVATION The Gene Ontology (GO) is a controlled vocabulary designed to represent the biological concepts pertaining to gene products. This study investigates the methods for identifying informative subsets of GO terms in an automatic and objective fashion. This task in turn requires addressing the following issues: how to represent the semantic context of GO terms, what metrics are suitable f...

Publication date: 2006