Sifting abstracts from Medline and evaluating their relevance to molecular biology
Abstract
The most important knowledge in biology currently exists as raw text documents. Bibliographic databases of biomedical articles can be searched, but an efficient procedure should also evaluate the relevance of the retrieved documents to biology. In genetics, this challenge is even trickier because gene naming conventions are inconsistent. We aim to define a good approach for collecting abstracts relevant to biology and to the species and genes under study. Our approach relies on defining the best queries and on detecting and filtering the best sources.
Similar resources
Ontology Based Corpus Annotation and Tools
With the explosion of results in molecular biology there is an increased need for IE to extract knowledge to support database building and to search intelligently for information in online journal collections. We aim to build information extraction systems from biology papers and their abstracts available from the MEDLINE database[1, 3]. As a part of a project on information extraction from the...
Towards Retrieving Relevant Information for Answering Clinical Comparison Questions
This paper introduces the task of automatically answering clinical comparison questions using MEDLINE abstracts. In the beginning, clinical comparison questions and the main challenges in recognising and extracting their components are described. Then, different strategies for retrieving MEDLINE abstracts are shown. Finally, the results of an initial experiment judging the relevance of MEDLINE ...
Part-of-Speech Tagging in Molecular Biology Scientific Abstracts Using Morphological and Contextual Statistical Information
In this paper a probabilistic tagger for molecular biology related abstracts is presented and evaluated. The system consists of three modules: a rule based molecular-biology names detector, an unknown words handler, and a Hidden Markov model based tagger which are used to annotate the corpus with an extended set of grammatical and molecular biology tags. The complete system has been evaluated u...
Mining molecular binding terminology from biomedical text
Automatic access to information regarding macromolecular binding relationships would provide a valuable resource to the biomedical community. We report on a pilot project to mine such information from the molecular biology literature. The program being developed takes advantage of natural language processing techniques and is supported by two repositories of biomolecular knowledge. A formative ...
Extracting the Names of Genes and Gene Products with a Hidden Markov Model
We report the results of a study into the use of a linear interpolating hidden Markov model (HMM) for the task of extracting technical terminology from MEDLINE abstracts and texts in the molecular-biology domain. This is the first stage in a system that will extract event information for automatically updating biology databases. We trained the HMM entirely with bigrams based on lexical and charac...
Journal title:
- Studies in health technology and informatics
Volume 124
Publication date: 2006