Recent Advances in Memory-Based Part-of-Speech Tagging
Abstract
Memory-based learning algorithms are lazy learners. Examples of a task are stored in memory, and processing is largely postponed to the time when new instances of the task need to be solved. This is then done by extrapolating directly from those remembered instances which are most similar to the present ones. Using memory-based learning for Part-of-Speech tagging has a number of advantages over traditional statistical POS taggers: (i) there is no need for an additional smoothing component for sparse data, (ii) even low-frequency or exceptional patterns can contribute to generalization, (iii) the use of a weighted similarity metric allows for an easy integration of different information sources, and (iv) both development time and processing speed are very fast (in the order of hours and thousands of words/sec, respectively). In recent work we have applied the Memory-Based Tagger (MBT) to a number of different languages and corpora (English, Dutch, Czech, Swedish, and Spanish). Furthermore, we have performed a controlled experimental comparison of MBT with several other POS tagging algorithms.
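The idea described in the abstract (store every training instance, then tag a new word by extrapolating from the most similar stored instances under a weighted similarity metric) can be illustrated with a small sketch. This is not the MBT implementation: the three-feature context window, the hand-set values in `WEIGHTS`, the toy corpus, and all function names are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical feature weights for (previous tag, focus word, next word).
# These values are made up; a real system would estimate them from data.
WEIGHTS = (1.0, 3.0, 0.5)


def similarity(a, b, weights=WEIGHTS):
    """Weighted overlap: sum the weights of the features on which a and b agree."""
    return sum(w for x, y, w in zip(a, b, weights) if x == y)


def make_instances(tagged_sentence):
    """Turn one tagged sentence into (features, tag) pairs."""
    words = [w for w, _ in tagged_sentence]
    tags = [t for _, t in tagged_sentence]
    instances = []
    for i, tag_i in enumerate(tags):
        prev_tag = tags[i - 1] if i > 0 else "<s>"
        next_word = words[i + 1] if i + 1 < len(words) else "</s>"
        instances.append(((prev_tag, words[i], next_word), tag_i))
    return instances


def train(tagged_sentences):
    """'Training' a lazy learner only stores every instance in memory."""
    memory = []
    for sentence in tagged_sentences:
        memory.extend(make_instances(sentence))
    return memory


def tag(words, memory):
    """Tag left to right, reusing each predicted tag as context for the next word."""
    tags = []
    for i, word in enumerate(words):
        prev_tag = tags[-1] if tags else "<s>"
        next_word = words[i + 1] if i + 1 < len(words) else "</s>"
        query = (prev_tag, word, next_word)
        # Find the highest similarity, then let all equally close neighbours vote.
        best = max(similarity(query, feats) for feats, _ in memory)
        votes = Counter(t for feats, t in memory if similarity(query, feats) == best)
        tags.append(votes.most_common(1)[0][0])
    return tags


# Toy corpus, invented for this example.
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("barks", "VERB")],
]

memory = train(TOY_CORPUS)
print(tag(["the", "cat", "sleeps"], memory))
print(tag(["a", "bird", "sleeps"], memory))
```

In the second call the word "bird" never occurs in the toy corpus, yet the sketch still assigns it a tag from the partially matching stored contexts. This mirrors advantage (i) in the abstract: no separate smoothing component is involved, because partial matches on the remaining features stand in for the unseen word.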
Similar resources
Part-of-Speech Tagging of Persian Using a Fuzzy Network Model
Part-of-speech tagging (POS tagging) is an ongoing research topic in natural language processing (NLP). The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS tagging, or simply tagging. Parts of speech are also known as word classes or lexical categories. The purpose of POS tagging is determining the grammatical ...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences and produces a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline mode. Unfortunately, in pipel...
A Part-of-Speech Tagging System for the Persian Language
Abstract: Part-of-speech (POS) tagging is an essential step for many models and methods in other areas of natural language processing, such as machine translation, spell checking, text-to-speech, automatic speech recognition, etc. So far, highly accurate POS taggers have been created for many languages. In this paper, we focus on POS tagging in the Persian language. Because of problems in Persian POS t...
Resource-Light Bantu Part-of-Speech Tagging
Recent scientific publications on data-driven part-of-speech tagging of Sub-Saharan African languages have reported encouraging accuracy scores, using off-the-shelf tools and often fairly limited amounts of training data. Unfortunately, no research efforts exist that explore which type of linguistic features contribute to accurate part-of-speech tagging for the languages under investigation. Th...
Rapid Development of NLP Modules with Memory-Based Learning
The need for software modules performing natural language processing (NLP) tasks is growing. These modules should perform efficiently and accurately, while at the same time rapid development is often mandatory. Recent work has indicated that machine learning techniques in general, and memory-based learning (MBL) in particular, offer the tools to meet both ends. We present examples of modules tr...