Blocking reduction strategies in hierarchical text classification
Similar resources
Hierarchical Text Classification and Evaluation
Hierarchical classification refers to assigning one or more suitable categories from a hierarchical category space to a document. While previous work in hierarchical classification focused on virtual category trees where documents are assigned only to the leaf categories, we propose a top-down level-based classification method that can classify documents to both leaf and internal categories. ...
On Dataless Hierarchical Text Classification
In this paper, we systematically study the problem of dataless hierarchical text classification. Unlike standard text classification schemes that rely on supervised training, dataless classification depends on understanding the labels of the sought-after categories and requires no labeled data. Given a collection of text documents and a set of labels, we show that understanding the labels can b...
Hierarchical Bayes for Text Classification
Naive Bayes models have been very popular in several classification tasks. In this paper we study the application of these models to classification tasks where the data is sparse, i.e., a large number of possible outcomes do not appear in the data. Traditionally, point estimates of the model parameters, and in particular point estimates based on Laplace's rule, have been popular for such spars...
Performance measurement framework for hierarchical text classification
Hierarchical text classification, or simply hierarchical classification, refers to assigning a document to one or more suitable categories from a hierarchical category space. In our literature survey, we have found that the existing hierarchical classification experiments used a variety of measures to evaluate performance. These performance measures often assume independence between categories an...
Hierarchical Discriminative Classification for Text-Based Geolocation
Text-based document geolocation is commonly rooted in language-based information retrieval techniques over geodesic grids. These methods ignore the natural hierarchy of cells in such grids and fall afoul of independence assumptions. We demonstrate the effectiveness of using logistic regression models on a hierarchy of nodes in the grid, which improves upon the state-of-the-art accuracy by sever...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2004
ISSN: 1041-4347
DOI: 10.1109/tkde.2004.50