Creating Knowledgebases to Text-Mine PUBMED Articles Using Clustering Techniques
نویسندگان
چکیده
Knowledgebase-mediated text-mining approaches work best when processing the natural language of domain-specific text. To enhance the utility of our successfully tested program-NeuroText, and to extend its methodologies to other domains, we have designed clustering algorithms, which is the principal step in automatically creating a knowledgebase. Our algorithms are designed to improve the quality of clustering by parsing the test corpus to include semantic and syntactic parsing
منابع مشابه
On expert curation and scalability: UniProtKB/Swiss-Prot as a case study
Motivation Biological knowledgebases, such as UniProtKB/Swiss-Prot, constitute an essential component of daily scientific research by offering distilled, summarized and computable knowledge extracted from the literature by expert curators. While knowledgebases play an increasingly important role in the scientific community, their ability to keep up with the growth of biomedical literature is un...
متن کاملTextrous!: Extracting Semantic Textual Meaning from Gene Sets
The un-biased and reproducible interpretation of high-content gene sets from large-scale genomic experiments is crucial to the understanding of biological themes, validation of experimental data, and the eventual development of plans for future experimentation. To derive biomedically-relevant information from simple gene lists, a mathematical association to scientific language and meaningful wo...
متن کاملA Document Clustering and Ranking System for Exploring MEDLINE Citations
Design: A text mining system framework for automatic document clustering and ranking organized MEDLINE citations following simple PubMed queries. The system grouped the retrieved citations, ranked the citations in each cluster, and generated a set of keywords and MeSH terms to describe the common theme of each cluster. Measurements: Several possible ranking functions were compared, including ci...
متن کاملارتقای کیفیت دستهبندی متون با استفاده از کمیته دستهبند دو سطحی
Nowadays, the automated text classification has witnessed special importance due to the increasing availability of documents in digital form and ensuing need to organize them. Although this problem is in the Information Retrieval (IR) field, the dominant approach is based on machine learning techniques. Approaches based on classifier committees have shown a better performance than the others. I...
متن کاملBioKI: Enzymes - an adaptable system to locate low-frequency information in full-text proteomics articles
BioKI:Enzymes is a literature navigation system that uses a two-step process. First, full-text articles are retrieved from PubMed Central (PMC). Then, for each article, the most relevant passages are identified according to a set of user selected keywords, and the articles are ranked according to the pertinence of the representative passages. In contrast to most existing systems in information ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- AMIA ... Annual Symposium proceedings. AMIA Symposium
دوره شماره
صفحات -
تاریخ انتشار 2003