Selective Classifiers for Part-of-Speech Tagging
Abstract
We investigate the use of selective classifiers for part-of-speech (POS) tagging. The idea is to allow classifiers to abstain on hard instances, passing them to downstream classifiers that may have more context available. In this report we focus on just the first stage of such a cascade, and ask whether selective classifiers attain the accuracies needed on the instances they accept, given that such instances will not be revisited by downstream processing. We show that a selective classifier constructed as an abstaining committee of two off-the-shelf POS taggers can indeed achieve very high accuracies with modest drops in coverage. We also compute the overall accuracy over all instances by applying majority vote to the abstentions, and we find that this robustly yields state-of-the-art accuracies.
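The first-stage idea described in the abstract can be illustrated with a minimal sketch. It assumes that the committee accepts a token only when both taggers output the same tag and abstains otherwise; the abstract does not spell out the exact rule, so this is an assumption. The function names `committee_tag` and `coverage_and_accuracy` and the toy tag sequences are hypothetical and are not taken from the report.

```python
from typing import List, Optional, Sequence, Tuple


def committee_tag(tags_a: Sequence[str], tags_b: Sequence[str]) -> List[Optional[str]]:
    """Return the shared tag where both taggers agree, else None (abstain).

    Assumption: agreement between the two taggers is the acceptance rule.
    """
    if len(tags_a) != len(tags_b):
        raise ValueError("tag sequences must have equal length")
    return [a if a == b else None for a, b in zip(tags_a, tags_b)]


def coverage_and_accuracy(
    predicted: Sequence[Optional[str]], gold: Sequence[str]
) -> Tuple[float, float]:
    """Coverage = fraction of tokens accepted; accuracy is measured on accepted tokens only."""
    accepted = [(p, g) for p, g in zip(predicted, gold) if p is not None]
    coverage = len(accepted) / len(gold) if gold else 0.0
    accuracy = sum(p == g for p, g in accepted) / len(accepted) if accepted else 0.0
    return coverage, accuracy


# Toy data (made up for illustration; not results from the report).
gold = ["DT", "NN", "VBZ", "JJ", "NN"]
tags_a = ["DT", "NN", "VBZ", "JJ", "VB"]   # output of hypothetical tagger A
tags_b = ["DT", "NN", "VBZ", "NN", "NN"]   # output of hypothetical tagger B

predicted = committee_tag(tags_a, tags_b)
coverage, accuracy = coverage_and_accuracy(predicted, gold)
print(predicted, coverage, accuracy)
```

On this toy input the committee abstains on the last two tokens, where the taggers disagree, so accuracy is computed only over the three accepted tokens. The abstained tokens are the ones that a downstream stage, or a majority vote, would need to resolve.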
Similar resources
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of natural-language sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline. Unfortunately, in pipel...
A Part-of-Speech Tagging System for the Persian Language
Part-of-speech (POS) tagging is an essential step for many models and methods in other areas of natural language processing, such as machine translation, spell checking, text-to-speech, and automatic speech recognition. So far, highly accurate POS taggers have been created for many languages. In this paper, we focus on POS tagging in the Persian language. Because of problems in Persian POS t...
Arabic Tokenization, Part-of-Speech Tagging and Morphological Disambiguation in One Fell Swoop
We present an approach to using a morphological analyzer for tokenizing and morphologically tagging (including part-of-speech tagging) Arabic words in one process. We learn classifiers for individual morphological features, as well as ways of using these classifiers to choose among entries from the output of the analyzer. We obtain accuracy rates on all tasks in the
Arabic Part of speech Tagging using k-Nearest Neighbour and Naive Bayes Classifiers Combination
Part-of-speech (POS) tagging is an important preprocessing step in many natural language processing applications, such as text summarization, question answering, and information retrieval. It is the process of assigning every word in a given context its appropriate part of speech. Different POS tagging techniques have been developed and evaluated in the literature. Curre...
Data-Driven Part-of-Speech Tagging of Kiswahili
In this paper we present experiments with data-driven part-of-speech taggers trained and evaluated on the annotated Helsinki Corpus of Swahili. Using four current state-of-the-art data-driven taggers, TnT, MBT, SVMTool and MXPOST, we find MXPOST to be the most accurate tagger for the Kiswahili dataset. We further improve on the performance of the individual taggers by combining ...