Semantic Text Mining and its Application in Biomedical Domain

Authors

  • Illhoi Yoo
  • Xia Lin
  • Bahrad A. Sokhansanj
  • Don Goelman
  • TaeWhan Jung
  • YoungJae Jung
Abstract

Semantic Text Mining and its Application in Biomedical Domain. Illhoi Yoo; Xiaohua Hu, Ph.D.

A huge amount of biomedical knowledge and novel discoveries has been produced and collected in text databases and digital libraries such as MEDLINE, because text is the most natural form in which to store information. Text mining is employed to cope with this pressing information overload. However, traditional text mining approaches have several problems, such as the use of the vector representation for documents. In this thesis, we introduce a semantic text mining approach that can overcome these problems. The approach consists of the following text mining components: a graphical representation method for documents that relies on domain ontologies; document clustering that takes advantage of scale-free network theory to mine the corpus-level graphical representation; text summarization; and a semantic version of Swanson’s ABC model. The primary contributions of this dissertation are fourfold. First, we introduce a graphical representation method for documents that takes advantage of domain ontologies. Second, the semantic document clustering approach is unique in that it provides users with document cluster models derived from an ontology-enriched scale-free representation of a set of documents; these models serve as summaries of each document cluster and also explain the document categorization. Third, to maximize the usefulness of document clustering, we introduce a text summarization approach that makes use of the document cluster models. Finally, we introduce a semantic way to generate reasonable hypotheses based on evidence from the biomedical literature, using the complementary structures in disjoint literatures.
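
The last contribution builds on Swanson's ABC model, whose basic idea can be illustrated with a small sketch. The Python below is not the thesis's semantic method. It is a plain co-occurrence version, under the assumption that each document has already been reduced to a set of concept names. The function name `abc_candidates` and the toy documents are hypothetical, chosen to echo Swanson's well-known fish oil and Raynaud's disease example.

```python
from collections import defaultdict


def abc_candidates(docs, a_term):
    """Rank C concepts that are linked to a_term only through shared B concepts.

    docs   : list of sets of concept names (assumed pre-extracted per document)
    a_term : the starting concept A
    Returns a list of (c_term, sorted list of bridging B terms), best first.
    """
    # B terms: every concept that co-occurs with A in at least one document.
    b_terms = set()
    for doc in docs:
        if a_term in doc:
            b_terms |= doc - {a_term}

    # C terms: concepts that never co-occur with A but do co-occur with some B
    # in a document that does not mention A (the "disjoint" literature).
    links = defaultdict(set)
    for doc in docs:
        if a_term in doc:
            continue
        bridges = doc & b_terms
        if not bridges:
            continue
        for c_term in doc - b_terms:
            links[c_term] |= bridges

    # More distinct bridging B terms means a better-supported hypothesis.
    ranked = sorted(links.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(c_term, sorted(bs)) for c_term, bs in ranked]


# Toy corpus (illustrative only, not real MEDLINE data).
toy_docs = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "Raynaud's disease"},
    {"platelet aggregation", "Raynaud's disease"},
    {"blood viscosity", "hypertension"},
]

for c_term, bridges in abc_candidates(toy_docs, "fish oil"):
    print(c_term, "via", bridges)
```

In this toy corpus, "Raynaud's disease" ranks first because two distinct B concepts connect it to "fish oil", although no single document mentions both. A semantic variant, as the abstract describes, would presumably draw concepts and relations from a domain ontology rather than from raw co-occurrence.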

Similar resources

BIOTEX: A system for Biomedical Terminology Extraction, Ranking, and Validation

Term extraction is an essential task in domain knowledge acquisition. Although hundreds of terminologies and ontologies exist in the biomedical domain, the language evolves faster than our ability to formalize and catalog it. We may be interested in the terms and words explicitly used in our corpus in order to index or mine this corpus or just to enrich currently available terminologies and ont...

A Proposed Data Mining Methodology and its Application to Industrial Procedures

Data mining is the process of discovering correlations, patterns, trends or relationships by searching through a large amount of data stored in repositories, corporate databases, and data warehouses. Industrial procedures with the help of engineers, managers, and other specialists, comprise a broad field and have many tools and techniques in their problem-solving arsenal. The purpose of this st...

BeCAS: biomedical concept recognition services and visualization

SUMMARY The continuous growth of the biomedical scientific literature has been motivating the development of text-mining tools able to efficiently process all this information. Although numerous domain-specific solutions are available, there is no web-based concept-recognition system that combines the ability to select multiple concept types to annotate, to reference external databases and to a...

On Computationally-Enhanced Visual Analysis of Heterogeneous Data and Its Application in Biomedical Informatics

With the advance of new data acquisition and generation technologies, the biomedical domain is becoming increasingly data-driven. Thus, understanding the information in large and complex data sets has been in the focus of several research fields such as statistics, data mining, machine learning, and visualization. While the first three fields predominantly rely on computational power, visualiza...

Application of Single-Frequency Time-Space Filtering Technique for Seismic Ground Roll and Random Noise Attenuation

Time-frequency filtering is an acceptable technique for attenuating noise in 2-D (time-space) and 3-D (time-space-space) reflection seismic data. The common approach for this purpose is transforming each seismic signal from 1-D time domain to a 2-D time-frequency domain and then denoising the signal by a designed filter and finally transforming back the filtered signal to original time domain. ...

Publication date: 2006