COV Model and its Application in Chinese Part-of-Speech Tagging
نویسندگان
چکیده
This article presents a new sequence labeling model named Context OVerlapping (COV) model, which expands observation from single word to n-gram unit and there is an overlapping part between the neighboring units. Due to the co-occurrence constraint and transition constraint, COV model reduces the search space and improves tagging accuracy. The 2-gram COV is applied to Chinese PoS tagging and the precision rate of the open test is as high as 96.83%, which is higher than the second order HMM, which is 95.73%. The result is also comparable to the discriminative models but COV takes much less training time than them. With symbol decoding COV prunes many nodes before statistics decoding and the search space of COV is about10-20% less than that of HMM.
منابع مشابه
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
متن کاملHidden Markov Model and Its Application in Natural Language Processing
This paper describes Hidden Markov Model and its application in natural language process, first introduces the basic concept of Hidden Markov Model, then introduces the three basic issues and the basic algorithm to solve the problems, finally gives the demonstration of the application of Chinese part-of-speech tagging and speech recognition via Hidden Markov Model.
متن کاملبرچسبگذاری ادات سخن زبان فارسی با استفاده از مدل شبکۀ فازی
Part of speech tagging (POS tagging) is an ongoing research in natural language processing (NLP) applications. The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS-tagging, or simply tagging. Parts of speech are also known as word classes or lexical categories. The purpose of POS tagging is determining the grammatical ...
متن کاملAn Enhanced Model for Chinese Word Segmentation and Part-of-Speech Tagging
This paper will present an enhanced probabilistic model for Chinese word segmentation and part-of-speech (POS) tagging. The model introduces the information of Chinese word length as one of its features to reach a more accurate result. And in addition, the model also achieves the integration of segmentation and POS tagging. After presenting the model, this paper will give a brief discussion on ...
متن کاملA pair-based language model for the robust lexical analysis in Chinese text-to-speech synthesis
This paper presents a robust method of lexical analysis for Chinese text-to-speech (TTS) synthesis using a pair-based Language Model (LM). The traditional way of Chinese lexical analysis simply regards the word segmentation and part-of-speech (POS) tagging as two separated phases. Each of them utilizes its own algorithms and models. Actually, the POS information is useful for word segmentation,...
متن کامل