Boosting for Text Classification with Subject Headings

Authors

  • Kwan Yi
  • Jamshid Beheshti
Abstract

The aim of this study is to investigate how Medical Subject Headings (MeSH), used as a source of background knowledge, can improve text classification (TC) results. The hypothesis is tested on two different sets of medical documents using a hidden Markov model (HMM)-based text classifier. Experimental results show that using MeSH improves classification accuracy.

Similar resources

Full-texts representation with Medical Subject Headings, and co-citations network reranking strategies for TREC 2014 Clinical Decision Support Track

In TREC 2014 Clinical Decision Support Track, the task was to retrieve full-texts relevant for answering generic clinical questions about medical records. For this purpose, we investigated a large range of strategies in the five runs we officially submitted. Concerning Information Retrieval (IR), we tested two different indexing levels: documents or sections. Section indexing was clearly below ...

Combining Active Learning and Boosting for Naïve Bayes Text Classifiers

This paper presents a variant of the AdaBoost algorithm for boosting Naïve Bayes text classifiers, called AdaBUS, which combines active learning with boosting. Boosting has been shown to effectively improve the accuracy of machine-learning-based classifiers. However, the Naïve Bayes classifier, which is remarkably successful in practice for text classification problems, is known not to...

BoosTexter: A Boosting-based System for Text Categorization

This work focuses on algorithms which learn from examples to perform multiclass text and speech categorization tasks. Our approach is based on a new and improved family of boosting algorithms. We describe in detail an implementation, called BoosTexter, of the new boosting algorithms for text categorization tasks. We present results comparing the performance of BoosTexter and a number of other t...

A Boosting Algorithm for Classification of Semi-Structured Text

The focus of research in text classification has expanded from simple topic identification to more challenging tasks such as opinion/modality identification. Unfortunately, the latter goals exceed the ability of the traditional bag-of-word representation approach, and a richer, more structural representation is required. Accordingly, learning algorithms must be created that can handle the struc...

TreeBoost.MH: A Boosting Algorithm for Multi-label Hierarchical Text Categorization

Hierarchical Text Categorization (HTC) is the task of generating (usually by means of supervised learning algorithms) text classifiers that operate on hierarchically structured classification schemes. Notwithstanding the fact that most large-sized classification schemes for text have a hierarchical structure, so far the attention of text classification researchers has mostly focused on algorithm...

Publication date: 2006