Mining MEDLINE for Similar Genes and Similar Drugs
Abstract
Hypothesis generation, a crucial initial step in making scientific discoveries, relies on prior knowledge, experience, and intuition. Chance connections made between seemingly unrelated concepts sometimes turn out to be fruitful. A key goal in text mining is to assist in this process by automatically discovering a small set of interesting hypotheses from a suitable text collection. We focus on text mining in the biomedical domain using MEDLINE, the database produced by the National Library of Medicine, which contains more than 12 million citations. Our overall goal is to build applications that mine MEDLINE for novel concept connections and thereby support scientists in hypothesis discovery. In this paper we first present concept profiles as a mechanism for generating concept representations from text collections. Concept profiles offer several advantages: they can be as current as the text database or generated from temporal subsets, they may be restricted to particular views, and they may be generated for concepts that are as complex as needed. We then show how concept profiles may be used to identify similar concepts. In particular, we present experiments in which concept profiles are used to identify genes that are associated with the same disease and drugs that are functionally similar.
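The abstract does not say how a concept profile is weighted or how two profiles are compared. As a rough illustration only, the sketch below assumes a profile is a TF-IDF-style weighted term vector built from the documents retrieved for a concept, and that similarity is measured with cosine similarity. Both choices, the function names, and the toy corpus (with placeholder concepts GENE_A, GENE_B, GENE_C) are assumptions for illustration, not the authors' method or data.

```python
import math
from collections import Counter

# Toy corpus (hypothetical): each concept maps to the term lists of the
# documents retrieved for it. Real input would come from MEDLINE records.
corpus = {
    "GENE_A": [["insulin", "glucose", "diabetes"], ["insulin", "receptor"]],
    "GENE_B": [["glucose", "diabetes", "insulin"], ["diabetes", "obesity"]],
    "GENE_C": [["tumor", "apoptosis"], ["tumor", "kinase"]],
}

def document_frequencies(corpus):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for docs in corpus.values():
        for doc in docs:
            df.update(set(doc))
    return df

def build_profile(docs, df, n_docs):
    """Assumed profile: term frequency within the concept's documents
    multiplied by log inverse document frequency over the whole corpus."""
    tf = Counter(term for doc in docs for term in doc)
    return {t: count * math.log(n_docs / df[t]) for t, count in tf.items()}

def cosine(p, q):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * q.get(t, 0.0) for t, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)

def rank_similar(name, profiles):
    """Return the other concepts sorted by decreasing similarity to `name`."""
    scores = [(other, cosine(profiles[name], prof))
              for other, prof in profiles.items() if other != name]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

df = document_frequencies(corpus)
n_docs = sum(len(docs) for docs in corpus.values())
profiles = {name: build_profile(docs, df, n_docs) for name, docs in corpus.items()}

for other, score in rank_similar("GENE_A", profiles):
    print(f"{other}: {score:.3f}")
```

In this toy setting GENE_B ranks above GENE_C for GENE_A because the two share weighted terms, while GENE_C shares none. Restricting a profile to a "view" or a temporal subset, as the abstract mentions, would correspond here to filtering the terms or documents before calling `build_profile`.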
Similar Resources
Gene regulation network fitting of genes involved in the pathophysiology of fatty liver in the mice by promoter mining
Background and Aim: Non-Alcoholic Fatty Liver Disease (NAFLD) is the major cause of chronic liver disease in developed countries. In this study, we identified the most important transcription factors and biological mechanisms affecting the incidence of fatty liver disease using promoter region data mining. Materials and Methods: In this study, at first, the marker genes associated with this...
Learning the Structure of Biomedical Relationships from Unstructured Text
The published biomedical research literature encompasses most of our understanding of how drugs interact with gene products to produce physiological responses (phenotypes). Unfortunately, this information is distributed throughout the unstructured text of over 23 million articles. The creation of structured resources that catalog the relationships between drugs and genes would accelerate the tr...
Corrigendum: An analysis of disease-gene relationship from Medline abstracts by DigSee
Diseases are developed by abnormal behavior of genes in biological events such as gene regulation, mutation, phosphorylation, and epigenetics and post-translational modification. Many studies of text mining attempted to identify the relationship between gene and disease by mining the literature, but they did not consider the biological events in which genes show abnormal behaviour in response t...
GenCLiP 2.0: a web server for functional clustering of genes and construction of molecular networks based on free terms
Identifying biological functions and molecular networks in a gene list and how the genes may relate to various topics is of considerable value to biomedical researchers. Here, we present a web-based text-mining server, GenCLiP 2.0, which can analyze human genes with enriched keywords and molecular interactions. Compared with other similar tools, GenCLiP 2.0 offers two unique features...
Human-Yeast Hybrids: New Visions to Genetic Disorders and Drug Discovery
Yeast has been a very helpful organism for centuries, especially with respect to fermentation of sugars and production of bread. However, for an even longer time, yeast has been a distant relative of humans having diverged from a common ancestor, about one billion years ago. More than one third of the yeast genes have human counterparts, despite this evolutionary distance. Yeast and human ortho...