Using Knowledge Sources to Improve Classification of Medical Text Reports
Abstract
Domain knowledge has been shown to be an important component of machine learning. However, the cost of obtaining domain knowledge to improve classifier generation can exceed the cost of manually creating classifiers. An alternative approach is to use existing knowledge sources to collect relevant domain knowledge and thereby improve machine learning. We investigated the use of two existing knowledge sources (a natural language processor and a controlled vocabulary metathesaurus) to improve the performance of machine learning algorithms in building classifiers for medical text reports. Both knowledge sources were found to significantly improve classifier performance. This demonstrates that existing knowledge sources can easily be used to improve machine learning performance.
Similar resources
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Beyond its own merits, text classification (TC) has become a cornerstone of many applications. The work presented here is part of, and a prerequisite for, a project we have undertaken to create a corpus for Arabic text processing. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...
Topic Modeling and Classification of Cyberspace Papers Using Text Mining
Global cyberspace networks provide individuals with platforms to interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and much more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...
Research Paper: The Role of Domain Knowledge in Automating Medical Text Report Classification
OBJECTIVE To analyze the effect of expert knowledge on the inductive learning process in creating classifiers for medical text reports. DESIGN The authors converted medical text reports to a structured form through natural language processing. They then inductively created classifiers for medical text reports using varying degrees and types of expert knowledge and different inductive learning...
Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents to predefined categories based on the documents’ contents and labeled training samples. Since word detection is a difficult and time-consuming task in the Persian language, a Bayesian text classifier is an appropriate approach to deal with different...
A Method for Keyword Extraction and Word Weighting to Improve Persian Text Classification
Due to the ever-increasing expansion of information and the huge amount of existing unstructured documents, keywords play a very important role in information retrieval. Because manual extraction of keywords faces various challenges, their automated extraction seems inevitable. This research attempts to use a thesaurus (a structured word-net) to extract them automatically. A...