Collocational Properties in Probabilistic Classifiers for Discourse Categorization
Abstract
Properties can be mapped to features in a machine learning algorithm in different ways, potentially yielding different results. In previous work, we experimented with various approaches to organizing collocational properties into features in a probabilistic classifier. It was found that one type of organization in particular, which is rarely used in NLP, allows one to take advantage of infrequent but high quality properties for an abstract discourse interpretation task. Based on an analysis of the experimental results, this paper suggests criteria for recognizing beneficial ways to include collocational information in probabilistic classifiers.
Similar resources
Hierarchically Classifying Documents Using Very Few Words
The proliferation of topic hierarchies for text documents has resulted in a need for tools that automatically classify new documents within such hierarchies. One can use existing classifiers by ignoring the hierarchical structure, treating the topics as separate classes. Unfortunately, in the context of text categorization, we are faced with a large number of classes and a huge number of relevan...
AAAI 1995 Spring Symposium on Empirical Methods in Discourse Interpretation and Generation: Learning Domain-Specific Discourse Rules for Information Extraction
This paper describes a system that learns discourse rules for domain-specific analysis of unrestricted text. The goal of discourse analysis in this context is to transform locally identified references to relevant information in the text into a coherent representation of the entire text. This involves a complex series of decisions about merging coreferential objects, filtering out irrelevant inform...
Mapping Collocational Properties into Machine Learning Features
This paper investigates interactions between collocational properties and methods for organizing them into features for machine learning. In experiments performing an event categorization task, Wiebe et al. (1997a) found that different organizations are best for different properties. This paper presents a statistical analysis of the results across different machine learning algorithms. In the e...
Probabilistic Event Categorization
This paper describes the automation of a new text categorization task. The categories assigned in this task are more syntactically, semantically, and contextually complex than those typically assigned by fully automatic systems that process unseen test data. Our system for assigning these categories uses a probabilistic classifier, developed with a recent method for formulating a probabilistic ...