Benefits of Associative Classification within Text Categorisation
Abstract
Associative Classification has been successfully employed in many diverse classification problem domains, showing high classification accuracy and adequate computation time relative to other traditionally used solutions. Despite this, very little research has applied it to Text Categorisation, and only a small number of approaches based on the concept presently exist. This paper aims to highlight the main characteristics of general Text Categorisation problems, provide an overview of the principal drawbacks of traditionally employed techniques, and outline the benefits of using Associative Classification methods as a replacement. The potential disadvantages of the approach are also considered, and a range of examples is included in each section to present a balanced, unbiased view.
Related articles
Evaluation of Feature Combination Approaches for Text Categorisation
Text categorisation relies heavily on feature selection. Both the possible reduction in dimensionality and improvements in classification performance are highly desirable. For feature selection in text, a range of methods has been developed, each having unique properties and selecting different features. However, it remains unclear which of them can be combined and ...
A Tutorial on Automated Text Categorisation
The automated categorisation (or classification) of texts into topical categories has a long history, dating back at least to 1960. Until the late ’80s, the dominant approach to the problem involved knowledge-engineering automatic categorisers, i.e. manually building a set of rules encoding expert knowledge on how to classify documents. In the ’90s, with the booming production a...
A practical implementation of automatic text categorisation and correction for the conversion of noisy OCR documents into braille and large print
A novel text categorisation method called Cmeasure is applied to the problem of automatically correcting standard blocks of noisy OCR text within structured documents such as credit card statements and standardised letters. The blocks of text in the scanned image are first identified then classified using the C-Measure algorithm against a small set of known correct text. The text block is subse...
Improving Biomedical Text Categorisation with NLP
Background: Text categorisation has been used in bioinformatics to help identify documents containing protein-protein interactions. Standard text categorisation methods have used the bag-of-words approach with little input from NLP. While this has proved effective in the past, there is some evidence that the techniques are not adequate in some biological domains. Here we examine how chunking, n...
Practical Application of Associative Classifier for Document Classification
In practical text classification tasks, the ability to interpret the classification result is as important as the ability to classify accurately. The associative classifier has favorable characteristics: rapid training, good classification accuracy, and excellent interpretability. However, the associative classifier has some obstacles to overcome when it is applied in the area of text classification...